[PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2023-10-20 Thread via GitHub
tigrulya-exe opened a new pull request, #43463: URL: https://github.com/apache/spark/pull/43463 ### What changes were proposed in this pull request? In the current version, `DataSource#checkAndGlobPathIfNecessary` qualifies paths via `Path#makeQualified` and `PartitioningAw

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2023-10-20 Thread via GitHub
beliefer commented on code in PR #43463: URL: https://github.com/apache/spark/pull/43463#discussion_r1366851439 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceSuite.scala: ## @@ -25,6 +27,7 @@ import org.apache.spark.sql.AnalysisException import

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2023-10-20 Thread via GitHub
tigrulya-exe commented on code in PR #43463: URL: https://github.com/apache/spark/pull/43463#discussion_r1366881061 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceSuite.scala: ## @@ -197,11 +233,35 @@ object TestPaths { ) ) + val txt

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2023-10-23 Thread via GitHub
beliefer commented on PR #43463: URL: https://github.com/apache/spark/pull/43463#issuecomment-1775070594 @tigrulya-exe Please re-trigger GA tests. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2023-10-24 Thread via GitHub
tigrulya-exe commented on PR #43463: URL: https://github.com/apache/spark/pull/43463#issuecomment-1776753389 @beliefer I re-ran tests several times, but they failed either due to lack of resources or due to flaky ProtobufCatalystDataConversionSuite. I will rebase on #43493 after it will be

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2023-10-24 Thread via GitHub
beliefer commented on PR #43463: URL: https://github.com/apache/spark/pull/43463#issuecomment-1778611179 Please fix the conflicts.

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2023-10-30 Thread via GitHub
tigrulya-exe commented on PR #43463: URL: https://github.com/apache/spark/pull/43463#issuecomment-1784591198 @beliefer Hi! I fixed the conflicts and rebased on master

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2023-11-07 Thread via GitHub
tigrulya-exe commented on PR #43463: URL: https://github.com/apache/spark/pull/43463#issuecomment-1798443089 @cloud-fan Hi! Could you take a look please?

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2023-11-28 Thread via GitHub
tigrulya-exe commented on PR #43463: URL: https://github.com/apache/spark/pull/43463#issuecomment-1829430527 @cloud-fan Hi! Could you take a look please?

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-01-19 Thread via GitHub
tigrulya-exe commented on PR #43463: URL: https://github.com/apache/spark/pull/43463#issuecomment-1900078190 @cloud-fan Hi! I've rebased on master and fixed conflicts. Could you please take a look?

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-01 Thread via GitHub
cloud-fan commented on PR #43463: URL: https://github.com/apache/spark/pull/43463#issuecomment-1921563767 The fix is straightforward but the test is convoluted. How do you test `HarFileSystem`? I can't find any code setting the file system, but only the directory name contains `har`.

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-06 Thread via GitHub
tigrulya-exe commented on PR #43463: URL: https://github.com/apache/spark/pull/43463#issuecomment-1929611462 @cloud-fan we construct absolute file paths with `har://` scheme in the [DataSourceSuite#buildFullHarPaths](https://github.com/apache/spark/blob/a005f6526eaf212cb5b60b56d673751037

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-06 Thread via GitHub
cloud-fan commented on code in PR #43463: URL: https://github.com/apache/spark/pull/43463#discussion_r1479914443 ## sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala: ## @@ -1363,4 +1363,19 @@ class DataFrameReaderWriterSuite extends QueryTest wi

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-06 Thread via GitHub
cloud-fan commented on code in PR #43463: URL: https://github.com/apache/spark/pull/43463#discussion_r1479915986 ## sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala: ## @@ -1363,4 +1363,19 @@ class DataFrameReaderWriterSuite extends QueryTest wi

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-07 Thread via GitHub
tigrulya-exe commented on code in PR #43463: URL: https://github.com/apache/spark/pull/43463#discussion_r1481571956 ## sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala: ## @@ -1363,4 +1363,19 @@ class DataFrameReaderWriterSuite extends QueryTest

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-07 Thread via GitHub
tigrulya-exe commented on code in PR #43463: URL: https://github.com/apache/spark/pull/43463#discussion_r1481586528 ## sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala: ## @@ -1363,4 +1363,19 @@ class DataFrameReaderWriterSuite extends QueryTest

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-07 Thread via GitHub
cloud-fan commented on code in PR #43463: URL: https://github.com/apache/spark/pull/43463#discussion_r1482325524 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceSuite.scala: ## @@ -214,4 +216,6 @@ class MockFileSystem extends RawLocalFileSystem {

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-07 Thread via GitHub
cloud-fan commented on code in PR #43463: URL: https://github.com/apache/spark/pull/43463#discussion_r1482326766 ## sql/core/src/test/resources/test-data/test-archive.har/_index: ## @@ -0,0 +1,7 @@ +%2F dir 1697722622766+493+tigrulya+hadoop 0 0 test.txt test.orc test.parquet te

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-08 Thread via GitHub
tigrulya-exe commented on code in PR #43463: URL: https://github.com/apache/spark/pull/43463#discussion_r1482569087 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceSuite.scala: ## @@ -214,4 +216,6 @@ class MockFileSystem extends RawLocalFileSystem

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-08 Thread via GitHub
cloud-fan commented on PR #43463: URL: https://github.com/apache/spark/pull/43463#issuecomment-1934013053 thanks, merging to master/3.5!

Re: [PR] [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing [spark]

2024-02-08 Thread via GitHub
cloud-fan closed pull request #43463: [SPARK-39910][SQL] Delegate path qualification to filesystem during DataSource file path globbing URL: https://github.com/apache/spark/pull/43463