Re: [PR] Use pager and allow configuration via `\pset` [datafusion]

2025-06-13 Thread via GitHub
djellemah commented on PR #15597: URL: https://github.com/apache/datafusion/pull/15597#issuecomment-2970492404 Merge conflict introduced by f40e0db161 on 02-may-2025 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [I] [Spark SQL] Fix InsertSuite failure when using native_iceberg_compat with Spark 3.4.3 [datafusion-comet]

2025-06-13 Thread via GitHub
mbutrovich closed issue #1875: [Spark SQL] Fix InsertSuite failure when using native_iceberg_compat with Spark 3.4.3 URL: https://github.com/apache/datafusion-comet/issues/1875

Re: [PR] fix: cast_struct_to_struct aligns to Spark behavior [datafusion-comet]

2025-06-13 Thread via GitHub
mbutrovich merged PR #1879: URL: https://github.com/apache/datafusion-comet/pull/1879

[I] Comet should choose the most suitable Parquet scan implementation automatically [datafusion-comet]

2025-06-13 Thread via GitHub
andygrove opened a new issue, #1881: URL: https://github.com/apache/datafusion-comet/issues/1881 ### What is the problem the feature request solves? We currently support three different Parquet scan implementations: - `native_comet` - `native_datafusion` - `native_iceberg_

Re: [PR] Feat/ffi scalar udf [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1033: Feat/ffi scalar udf URL: https://github.com/apache/datafusion-python/pull/1033

Re: [PR] enhance sql-using-python-udf example [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1054: enhance sql-using-python-udf example URL: https://github.com/apache/datafusion-python/pull/1054

Re: [PR] enhance sql-using-python-udf example [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1054: URL: https://github.com/apache/datafusion-python/pull/1054#issuecomment-2970805134 Closing since no activity. Please reopen if this continues to be an issue.

Re: [PR] 1065/enhancement/add ctx to `__init__.py` [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1072: 1065/enhancement/add ctx to `__init__.py` URL: https://github.com/apache/datafusion-python/pull/1072

Re: [PR] 1065/enhancement/add ctx to `__init__.py` [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1072: URL: https://github.com/apache/datafusion-python/pull/1072#issuecomment-2970807273 Closing since no activity. Please reopen PR if this continues to be a user's need.

Re: [PR] fix: improve CSV path handling and error handling in substrait example [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1073: fix: improve CSV path handling and error handling in substrait example URL: https://github.com/apache/datafusion-python/pull/1073

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #15770: URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2971305067 🤖: Benchmark completed Details ``` Comparing HEAD and topk-dynamic-filters Benchmark sort_tpch.json ┏━━

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #15770: URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2971297789 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubun

Re: [I] Add statistics to ParquetExec for *files* pruned [datafusion]

2025-06-13 Thread via GitHub
adriangb commented on issue #16402: URL: https://github.com/apache/datafusion/issues/16402#issuecomment-2971321730 Doesn't https://github.com/apache/datafusion/blob/4dd6923787084548c9ecc6d90c630c2c28ee9259/datafusion/datasource-parquet/src/metrics.rs#L30-L33 cover that?

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on code in PR #15770: URL: https://github.com/apache/datafusion/pull/15770#discussion_r2145790872 ## datafusion/core/tests/fuzz_cases/topk_filter_pushdown.rs: ## @@ -0,0 +1,387 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

Re: [PR] fix: correctly handle schemas with nested array of struct (native_iceberg_compat) [datafusion-comet]

2025-06-13 Thread via GitHub
andygrove commented on code in PR #1883: URL: https://github.com/apache/datafusion-comet/pull/1883#discussion_r2145878327 ## spark/src/test/scala/org/apache/comet/parquet/ParquetReadSuite.scala: ## @@ -1745,6 +1746,77 @@ abstract class ParquetReadSuite extends CometTestBase {

Re: [PR] fix: correctly handle schemas with nested array of struct (native_iceberg_compat) [datafusion-comet]

2025-06-13 Thread via GitHub
codecov-commenter commented on PR #1883: URL: https://github.com/apache/datafusion-comet/pull/1883#issuecomment-2971338111 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1883?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Use Tokio's task budget consistently [datafusion]

2025-06-13 Thread via GitHub
ozankabak commented on PR #16398: URL: https://github.com/apache/datafusion/pull/16398#issuecomment-2970506512 Thanks for the draft -- this is in line with my understanding from your description. I think it will inch us closer to a good, lasting solution (especially after your upstream tokio

Re: [PR] Migrate core test to insta, part1 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on code in PR #16324: URL: https://github.com/apache/datafusion/pull/16324#discussion_r2145201108 ## datafusion/core/tests/sql/explain_analyze.rs: ## @@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() { let formatted = arrow::util::pretty::prett

Re: [PR] Add fast paths for try_process_unnest [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #16389: URL: https://github.com/apache/datafusion/pull/16389#issuecomment-2970575208 🤖: Benchmark completed Details ``` group main opt-process-unnest -

Re: [I] Support reading multiple parquet files via `datafusion-cli` [datafusion]

2025-06-13 Thread via GitHub
robtandy commented on issue #16303: URL: https://github.com/apache/datafusion/issues/16303#issuecomment-2970581452 Thanks for creating this issue @alamb !! Regarding the location of the code, if it is in datafusion proper rather than the CLI, it would be available in datafusion pyt

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #15770: URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2970575427 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubun

Re: [PR] feat: pass ignore_nulls flag to first and last [datafusion-comet]

2025-06-13 Thread via GitHub
andygrove commented on PR #1866: URL: https://github.com/apache/datafusion-comet/pull/1866#issuecomment-2970625122 > All the tests for first and last are disabled currently: > > * [Re-enable tests for FIRST/LAST #1646](https://github.com/apache/datafusion-comet/issues/1646)

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #15770: URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2970721901 🤖: Benchmark completed Details ``` Comparing HEAD and topk-dynamic-filters Benchmark clickbench_extended.json

Re: [PR] build(deps): bump tokio from 1.41.1 to 1.44.2 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1099: build(deps): bump tokio from 1.41.1 to 1.44.2 in /examples/ffi-table-provider URL: https://github.com/apache/datafusion-python/pull/1099

Re: [PR] build(deps): bump tokio from 1.41.1 to 1.44.2 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1099: URL: https://github.com/apache/datafusion-python/pull/1099#issuecomment-2970935652 Closing because https://github.com/apache/datafusion-python/pull/1143 updates us to 1.45

Re: [PR] build(deps): bump tokio from 1.41.1 to 1.44.2 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1099: URL: https://github.com/apache/datafusion-python/pull/1099#issuecomment-2970935721 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor v

Re: [PR] feat: Add experimental auto mode for `COMET_PARQUET_SCAN_IMPL` [datafusion-comet]

2025-06-13 Thread via GitHub
andygrove commented on code in PR #1747: URL: https://github.com/apache/datafusion-comet/pull/1747#discussion_r2145246950 ## spark/src/main/scala/org/apache/comet/rules/CometScanRule.scala: ## @@ -105,8 +105,49 @@ case class CometScanRule(session: SparkSession) extends Rule[Spa

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
Dandandan commented on code in PR #15770: URL: https://github.com/apache/datafusion/pull/15770#discussion_r2145258417 ## datafusion/physical-plan/src/topk/mod.rs: ## @@ -207,6 +221,7 @@ impl TopK { // Idea: filter out rows >= self.heap.max() early (before passing to `R

Re: [PR] Add Interruptible Query Execution in Jupyter via KeyboardInterrupt Support [datafusion-python]

2025-06-13 Thread via GitHub
kosiew commented on PR #1141: URL: https://github.com/apache/datafusion-python/pull/1141#issuecomment-2970789318 @timsaucer Thanks for the review and nipping the `panic`.

Re: [PR] Feat/ffi scalar udf [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1033: URL: https://github.com/apache/datafusion-python/pull/1033#issuecomment-2970802418 Superseded by https://github.com/apache/datafusion-python/pull/1145

Re: [PR] WIP: scalar UDFs with metadata [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1110: URL: https://github.com/apache/datafusion-python/pull/1110#issuecomment-2971126021 Superseded by https://github.com/apache/datafusion-python/pull/1145

Re: [PR] WIP: scalar UDFs with metadata [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1110: WIP: scalar UDFs with metadata URL: https://github.com/apache/datafusion-python/pull/1110

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
adriangb commented on code in PR #15770: URL: https://github.com/apache/datafusion/pull/15770#discussion_r2145547651 ## datafusion/physical-plan/src/sorts/sort.rs: ## @@ -843,6 +846,8 @@ pub struct SortExec { common_sort_prefix: Vec, /// Cache holding plan properties l

[PR] fix: correctly handle schemas with nested array of struct (native_iceberg_compat) [datafusion-comet]

2025-06-13 Thread via GitHub
parthchandra opened a new pull request, #1883: URL: https://github.com/apache/datafusion-comet/pull/1883 ## Which issue does this PR close? The mapping between Spark and Parquet for schemas with field ids did not correctly handle the schemas with nested arrays of structs. ## R

Re: [D] Search Pushdown (e.g. Vector Search) Into Table Providers [datafusion]

2025-06-13 Thread via GitHub
GitHub user niebayes added a comment to the discussion: Search Pushdown (e.g. Vector Search) Into Table Providers Cool GitHub link: https://github.com/apache/datafusion/discussions/16358#discussioncomment-13461933

Re: [PR] feat: Add experimental auto mode for `COMET_PARQUET_SCAN_IMPL` [datafusion-comet]

2025-06-13 Thread via GitHub
andygrove commented on code in PR #1747: URL: https://github.com/apache/datafusion-comet/pull/1747#discussion_r2145229513 ## spark/src/main/scala/org/apache/comet/rules/CometScanRule.scala: ## @@ -105,8 +105,49 @@ case class CometScanRule(session: SparkSession) extends Rule[Spa

Re: [I] Fallback to Spark in native_datafusion/native_iceberg_compat if encryption is enabled [datafusion-comet]

2025-06-13 Thread via GitHub
andygrove commented on issue #1772: URL: https://github.com/apache/datafusion-comet/issues/1772#issuecomment-2970650359 This could also be handled in https://github.com/apache/datafusion-comet/issues/1881

Re: [PR] Chore: implement hour func as ScalarUDFImpl [datafusion-comet]

2025-06-13 Thread via GitHub
trompa commented on PR #1874: URL: https://github.com/apache/datafusion-comet/pull/1874#issuecomment-2970687941 I think I can actually just select hour(now()), and parse the query plan, would that work? On Fri, Jun 13, 2025, 14:57 Matt Butrovich ***@***.***> wrote: > *mbutro

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #15770: URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2971173365 > > QQuery 25│ 380.03 ms │279.23 ms │ +1.36x faster > > ```sql > SELECT SearchPhrase FROM hits WHERE SearchPhrase <> '' ORDER BY SearchPhrase LIMIT 10; >

[I] Add statistics to ParquetExec for *files* pruned [datafusion]

2025-06-13 Thread via GitHub
alamb opened a new issue, #16402: URL: https://github.com/apache/datafusion/issues/16402 ### Is your feature request related to a problem or challenge? - This is a follow on to the feature added by @adriangb in https://github.com/apache/datafusion/pull/16014 @adriangb added th

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #15770: URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2971198838 I also filed a ticket to add a metric that we can use to see when file pruning is working: - https://github.com/apache/datafusion/issues/16402

Re: [PR] Add fast paths for try_process_unnest [datafusion]

2025-06-13 Thread via GitHub
simonvandel commented on code in PR #16389: URL: https://github.com/apache/datafusion/pull/16389#discussion_r2145802029 ## datafusion/sql/src/select.rs: ## @@ -374,6 +383,14 @@ impl SqlToRel<'_, S> { fn try_process_aggregate_unnest(&self, input: LogicalPlan) -> Result {

Re: [PR] fix: Remove `null_equals_null` todo in `NestedLoopJoin` [datafusion]

2025-06-13 Thread via GitHub
alamb commented on code in PR #16390: URL: https://github.com/apache/datafusion/pull/16390#discussion_r2145093532 ## datafusion/physical-plan/src/joins/nested_loop_join.rs: ## @@ -718,8 +718,6 @@ struct NestedLoopJoinStream { inner_table: OnceFut, /// Information of in

Re: [PR] Add fast paths for try_process_unnest [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #16389: URL: https://github.com/apache/datafusion/pull/16389#issuecomment-2970396033 🤖 `./gh_compare_branch_bench.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh) Running Linux aal-dev 6.11.0-1015-gcp #15~

Re: [PR] Update Roadmap documentation [datafusion]

2025-06-13 Thread via GitHub
alamb commented on code in PR #16399: URL: https://github.com/apache/datafusion/pull/16399#discussion_r2145081087 ## docs/source/contributor-guide/roadmap.md: ## @@ -46,81 +46,12 @@ make review efficient and avoid surprises. # Quarterly Roadmap -A quarterly roadmap will be

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #15770: URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2970425995 > @alamb could you kick off your benchmarks run on this branch? Sorry -- I missed this. I have queued them up and they should be running shortly

Re: [PR] fix: Fixed error handling for `generate_series/range` [datafusion]

2025-06-13 Thread via GitHub
alamb commented on code in PR #16391: URL: https://github.com/apache/datafusion/pull/16391#discussion_r2145088320 ## datafusion/functions-table/src/generate_series.rs: ## @@ -197,11 +197,17 @@ impl TableFunctionImpl for GenerateSeriesFuncImpl { } let mut norm

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #15770: URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2970437633 > The interesting bit is that this is now faster even with predicate pushdown turned off thanks to the late partition / stats based pruning @alamb !! For the case of a single partition

Re: [PR] build(deps): bump crossbeam-channel from 0.5.13 to 0.5.15 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1102: URL: https://github.com/apache/datafusion-python/pull/1102#issuecomment-2970943961 Superseded by https://github.com/apache/datafusion-python/pull/1143

Re: [PR] build(deps): bump crossbeam-channel from 0.5.13 to 0.5.15 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1102: URL: https://github.com/apache/datafusion-python/pull/1102#issuecomment-2970944042 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor v

Re: [PR] build(deps): bump crossbeam-channel from 0.5.13 to 0.5.15 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1102: build(deps): bump crossbeam-channel from 0.5.13 to 0.5.15 in /examples/ffi-table-provider URL: https://github.com/apache/datafusion-python/pull/1102

Re: [PR] build(deps): bump tokio from 1.44.2 to 1.45.1 [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1134: URL: https://github.com/apache/datafusion-python/pull/1134#issuecomment-2970939340 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor v

Re: [PR] build(deps): bump uuid from 1.16.0 to 1.17.0 [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1135: URL: https://github.com/apache/datafusion-python/pull/1135#issuecomment-2970947128 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor v

Re: [PR] build(deps): bump uuid from 1.16.0 to 1.17.0 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1135: build(deps): bump uuid from 1.16.0 to 1.17.0 URL: https://github.com/apache/datafusion-python/pull/1135

Re: [PR] build(deps): bump uuid from 1.16.0 to 1.17.0 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1135: URL: https://github.com/apache/datafusion-python/pull/1135#issuecomment-2970947048 Superseded by https://github.com/apache/datafusion-python/pull/1143

Re: [PR] build(deps): bump mimalloc from 0.1.43 to 0.1.46 [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1105: URL: https://github.com/apache/datafusion-python/pull/1105#issuecomment-2970946553 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor v

Re: [PR] feat: Add experimental auto mode for `COMET_PARQUET_SCAN_IMPL` [datafusion-comet]

2025-06-13 Thread via GitHub
andygrove commented on code in PR #1747: URL: https://github.com/apache/datafusion-comet/pull/1747#discussion_r2145516448 ## spark/src/main/scala/org/apache/comet/rules/CometScanRule.scala: ## @@ -105,8 +105,49 @@ case class CometScanRule(session: SparkSession) extends Rule[Spa

Re: [PR] build(deps): bump mimalloc from 0.1.43 to 0.1.46 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1105: URL: https://github.com/apache/datafusion-python/pull/1105#issuecomment-2970946246 Superseded by https://github.com/apache/datafusion-python/pull/1143

Re: [PR] build(deps): bump mimalloc from 0.1.43 to 0.1.46 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1105: build(deps): bump mimalloc from 0.1.43 to 0.1.46 URL: https://github.com/apache/datafusion-python/pull/1105

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
Dandandan commented on code in PR #15770: URL: https://github.com/apache/datafusion/pull/15770#discussion_r2145521489 ## datafusion/physical-plan/src/topk/mod.rs: ## @@ -207,6 +221,7 @@ impl TopK { // Idea: filter out rows >= self.heap.max() early (before passing to `R

Re: [PR] build(deps): bump tokio from 1.44.2 to 1.45.1 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1134: build(deps): bump tokio from 1.44.2 to 1.45.1 URL: https://github.com/apache/datafusion-python/pull/1134

Re: [PR] build(deps): bump astral-sh/setup-uv from 5 to 6 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1090: build(deps): bump astral-sh/setup-uv from 5 to 6 URL: https://github.com/apache/datafusion-python/pull/1090

Re: [PR] feat: Add experimental auto mode for `COMET_PARQUET_SCAN_IMPL` [datafusion-comet]

2025-06-13 Thread via GitHub
andygrove commented on code in PR #1747: URL: https://github.com/apache/datafusion-comet/pull/1747#discussion_r2145468079 ## spark/src/main/scala/org/apache/comet/rules/CometScanRule.scala: ## @@ -105,8 +105,49 @@ case class CometScanRule(session: SparkSession) extends Rule[Spa

Re: [PR] build(deps): bump pyo3 from 0.24.1 to 0.24.2 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1114: URL: https://github.com/apache/datafusion-python/pull/1114#issuecomment-2970893819 @dependabot ignore this major version

Re: [PR] build(deps): bump pyo3 from 0.24.1 to 0.24.2 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1114: build(deps): bump pyo3 from 0.24.1 to 0.24.2 URL: https://github.com/apache/datafusion-python/pull/1114

Re: [PR] build(deps): bump pyo3 from 0.23.4 to 0.24.1 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1109: URL: https://github.com/apache/datafusion-python/pull/1109#issuecomment-2970895644 @dependabot ignore this major version

Re: [PR] build(deps): bump pyo3 from 0.24.1 to 0.24.2 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1114: URL: https://github.com/apache/datafusion-python/pull/1114#issuecomment-2970893094 Closing because our pyo3 upgrade is controlled by upstream dependency requirements.

Re: [PR] build(deps): bump pyo3-build-config from 0.24.1 to 0.25.0 [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1129: URL: https://github.com/apache/datafusion-python/pull/1129#issuecomment-2970896425 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor v

Re: [PR] build(deps): bump pyo3 from 0.24.1 to 0.24.2 [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1114: URL: https://github.com/apache/datafusion-python/pull/1114#issuecomment-2970893214 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor v

Re: [PR] build(deps): bump pyo3 from 0.23.4 to 0.24.1 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1109: build(deps): bump pyo3 from 0.23.4 to 0.24.1 in /examples/ffi-table-provider URL: https://github.com/apache/datafusion-python/pull/1109

Re: [PR] build(deps): bump pyo3 from 0.23.4 to 0.24.1 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1109: URL: https://github.com/apache/datafusion-python/pull/1109#issuecomment-2970895087 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor v

Re: [PR] build(deps): bump pyo3-build-config from 0.24.1 to 0.25.0 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1129: URL: https://github.com/apache/datafusion-python/pull/1129#issuecomment-2970896750 @dependabot ignore this major version

Re: [PR] build(deps): bump pyo3-build-config from 0.24.1 to 0.25.0 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1129: URL: https://github.com/apache/datafusion-python/pull/1129#issuecomment-2970896322 Closing because our pyo3 upgrade is controlled by upstream dependency requirements.

Re: [PR] build(deps): bump pyo3 from 0.23.4 to 0.24.1 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1109: URL: https://github.com/apache/datafusion-python/pull/1109#issuecomment-2970894994 Closing because our pyo3 upgrade is controlled by upstream dependency requirements.

Re: [PR] build(deps): bump pyo3-build-config from 0.24.1 to 0.25.0 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer closed pull request #1129: build(deps): bump pyo3-build-config from 0.24.1 to 0.25.0 URL: https://github.com/apache/datafusion-python/pull/1129

Re: [PR] build(deps): bump pyo3-build-config from 0.24.1 to 0.25.0 [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1129: URL: https://github.com/apache/datafusion-python/pull/1129#issuecomment-2970896867 OK, I won't notify you about version 0.x.x again, unless you re-open this PR.

Re: [PR] build(deps): bump pyo3 from 0.23.4 to 0.24.1 in /examples/ffi-table-provider [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1109: URL: https://github.com/apache/datafusion-python/pull/1109#issuecomment-2970895746 OK, I won't notify you about version 0.x.x again, unless you re-open this PR.

Re: [PR] feat: Support Parquet writer options [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on code in PR #1123: URL: https://github.com/apache/datafusion-python/pull/1123#discussion_r2145455758 ## python/datafusion/dataframe.py: ## @@ -704,38 +694,135 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def wri

Re: [I] Improved experience when remote object store URL does not end in `/` [datafusion]

2025-06-13 Thread via GitHub
alamb commented on issue #16302: URL: https://github.com/apache/datafusion/issues/16302#issuecomment-2970886506 Thanks @xiedeyantu -- I'll try and review it shortly

Re: [I] This Week in Comet (Jan 26) [datafusion-comet]

2025-06-13 Thread via GitHub
andygrove closed issue #1342: This Week in Comet (Jan 26) URL: https://github.com/apache/datafusion-comet/issues/1342

Re: [PR] build(deps): bump pyo3 from 0.24.1 to 0.24.2 [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1114: URL: https://github.com/apache/datafusion-python/pull/1114#issuecomment-2970893918 OK, I won't notify you about version 0.x.x again, unless you re-open this PR.

Re: [PR] build(deps): bump astral-sh/setup-uv from 5 to 6 [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1090: URL: https://github.com/apache/datafusion-python/pull/1090#issuecomment-2970901160 Closing because this is already updated in `main`

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #15770: URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2970903681 > And obviously we expect Q23 to get a huge bump once we enable filter pushdown... > > I think the only thing left here is for someone to review in detail 😄 I will do

Re: [PR] build(deps): bump astral-sh/setup-uv from 5 to 6 [datafusion-python]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #1090: URL: https://github.com/apache/datafusion-python/pull/1090#issuecomment-2970901234 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor v

Re: [PR] refactor: collect dataframe as stream in `__repr__` [datafusion-python]

2025-06-13 Thread via GitHub
timsaucer commented on PR #1015: URL: https://github.com/apache/datafusion-python/pull/1015#issuecomment-2970821413 Original problem was resolved by https://github.com/apache/datafusion-python/pull/1036 which I believe used part of this work. Thank you very much for the contribution!

Re: [PR] feat: Add experimental auto mode for `COMET_PARQUET_SCAN_IMPL` [datafusion-comet]

2025-06-13 Thread via GitHub
parthchandra commented on code in PR #1747: URL: https://github.com/apache/datafusion-comet/pull/1747#discussion_r2145482367 ## spark/src/main/scala/org/apache/comet/rules/CometScanRule.scala: ## @@ -105,8 +105,49 @@ case class CometScanRule(session: SparkSession) extends Rule[

Re: [I] Enable `Comet native metrics: scan` test for `native_iceberg_compat` [datafusion-comet]

2025-06-13 Thread via GitHub
parthchandra commented on issue #1882: URL: https://github.com/apache/datafusion-comet/issues/1882#issuecomment-2970909374 Metrics aren't really implemented correctly for `native_iceberg_compat` at the moment so enabling the test won't help. The metrics do need to be corrected.

Re: [I] [DISCUSSION] JOIN "task force" / project team [datafusion]

2025-06-13 Thread via GitHub
alamb commented on issue #15885: URL: https://github.com/apache/datafusion/issues/15885#issuecomment-2970916647 I have seen @irenjj / @duongcongtoai / @UBarney / @jonathanc-n working on this @xudong963 (a DataFusion committer and PMC member) has experience in this area and may have

Re: [PR] Add fast paths for try_process_unnest [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #16389: URL: https://github.com/apache/datafusion/pull/16389#issuecomment-2971026040 > logical_select_all_from_1000 10.80 120.4±0.22ms ? ?/sec 1.00 11.1±0.06ms ? ?/sec 🚀 The other planning benchmarks look like prett

Re: [PR] Add fast paths for try_process_unnest [datafusion]

2025-06-13 Thread via GitHub
alamb commented on code in PR #16389: URL: https://github.com/apache/datafusion/pull/16389#discussion_r2145582029 ## datafusion/sql/src/select.rs: ## @@ -374,6 +383,14 @@ impl SqlToRel<'_, S> { fn try_process_aggregate_unnest(&self, input: LogicalPlan) -> Result {

Re: [PR] chore(deps): bump prost-build from 0.13.5 to 0.14.0 in the proto group [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #16392: URL: https://github.com/apache/datafusion/pull/16392#issuecomment-2971066685 This needs to wait for an arrow-rs update

Re: [PR] Simplify expressions passed to table functions [datafusion]

2025-06-13 Thread via GitHub
alamb commented on code in PR #16388: URL: https://github.com/apache/datafusion/pull/16388#discussion_r2145600999 ## datafusion/core/src/execution/session_state.rs: ## @@ -1675,6 +1675,13 @@ impl ContextProvider for SessionContextProvider<'_> { .get(name)

Re: [PR] Optimize Hex Function [datafusion]

2025-06-13 Thread via GitHub
alamb commented on PR #16077: URL: https://github.com/apache/datafusion/pull/16077#issuecomment-2971062970 Marking as draft as I think this PR is no longer waiting on feedback and I am trying to make it easier to find PRs in need of review. Please mark it as ready for review when it is read

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
Dandandan commented on code in PR #15770: URL: https://github.com/apache/datafusion/pull/15770#discussion_r2145572928 ## datafusion/physical-plan/src/sorts/sort.rs: ## @@ -843,6 +846,8 @@ pub struct SortExec { common_sort_prefix: Vec, /// Cache holding plan properties

Re: [PR] TopK dynamic filter pushdown attempt 2 [datafusion]

2025-06-13 Thread via GitHub
Dandandan commented on PR #15770: URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2971038192 I wonder if we see any improvements on the "sort tpch with limit" benchmark? ```cargo run --release --bin dfbench -- sort-tpch --iterations 5 --path "${TPCH_DIR}" -o "${RESUL

Re: [PR] chore(deps): bump prost-build from 0.13.5 to 0.14.0 in the proto group [datafusion]

2025-06-13 Thread via GitHub
alamb closed pull request #16392: chore(deps): bump prost-build from 0.13.5 to 0.14.0 in the proto group URL: https://github.com/apache/datafusion/pull/16392

Re: [I] Request to update crates.io ownership [datafusion]

2025-06-13 Thread via GitHub
alamb closed issue #16323: Request to update crates.io ownership URL: https://github.com/apache/datafusion/issues/16323

Re: [I] Request to update crates.io ownership [datafusion]

2025-06-13 Thread via GitHub
alamb commented on issue #16323: URL: https://github.com/apache/datafusion/issues/16323#issuecomment-2971079530 I think we are now done with this

Re: [PR] chore(deps): bump prost-build from 0.13.5 to 0.14.0 in the proto group [datafusion]

2025-06-13 Thread via GitHub
dependabot[bot] commented on PR #16392: URL: https://github.com/apache/datafusion/pull/16392#issuecomment-2971066993 This pull request was built based on a group rule. Closing it will not ignore any of these versions in future pull requests. To ignore these dependencies, configure [ig
