[jira] [Updated] (HUDI-4944) The encoded slash (%2F) in partition path is not properly decoded during Spark read

2023-05-22 Thread Yue Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yue Zhang updated HUDI-4944:

Fix Version/s: 0.14.0
   (was: 0.13.1)

> The encoded slash (%2F) in partition path is not properly decoded during 
> Spark read
> ---
>
> Key: HUDI-4944
> URL: https://issues.apache.org/jira/browse/HUDI-4944
> Project: Apache Hudi
> Issue Type: Bug
> Components: bootstrap
> Reporter: Ethan Guo
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.14.0
>
> Attachments: Untitled
>
>
> When the source partitioned parquet table of a bootstrap operation has an 
> encoded slash (%2F) in the partition path, e.g., 
> "partition_path=2015%2F03%2F17", the metadata-only bootstrap stores the data 
> file path, including the encoded slash (%2F), in the bootstrap index. The 
> target bootstrapped Hudi table then cannot be read because of a 
> FileNotFoundException. The root cause is that the encoding of the slash is 
> lost when the new Path instance is created from the URI (see below: the path 
> contains "partition_path=2015/03/17" instead of 
> "partition_path=2015%2F03%2F17").
> {code:java}
> Caused by: java.io.FileNotFoundException: File does not exist: hdfs://localhost:62738/user/ethan/test_dataset_bootstrapped/partition_path=2015/03/17/e0fa3466-d3bc-43f7-b586-2f95d8745095_3-161-675_01.parquet
>     at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1528)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1521)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1521)
>     at org.apache.parquet.hadoop.util.HadoopInputFile.fromPath(HadoopInputFile.java:39)
>     at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:448)
>     at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.footerFileMetaData$lzycompute$1(Spark24HoodieParquetFileFormat.scala:131)
>     at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.footerFileMetaData$1(Spark24HoodieParquetFileFormat.scala:130)
>     at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.apply(Spark24HoodieParquetFileFormat.scala:134)
>     at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.apply(Spark24HoodieParquetFileFormat.scala:111)
>     at org.apache.hudi.HoodieDataSourceHelper$$anonfun$buildHoodieParquetReader$1.apply(HoodieDataSourceHelper.scala:71)
>     at org.apache.hudi.HoodieDataSourceHelper$$anonfun$buildHoodieParquetReader$1.apply(HoodieDataSourceHelper.scala:70)
>     at org.apache.hudi.HoodieBootstrapRDD.compute(HoodieBootstrapRDD.scala:60)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> {code}
> The path conversion that causes the problem is in the code below. Wrapping the 
> path in "new URI(file.filePath)" causes the "%2F" to be decoded into a literal 
> slash.
> Spark24HoodieParquetFileFormat (same for Spark32PlusHoodieParquetFileFormat):
> {code:java}
> val fileSplit =
>   new FileSplit(new Path(new URI(file.filePath)), file.start, file.length, Array.empty)
> {code}
> This causes the tests below to fail, so for now we need to use a partition 
> path without slashes in the value:
> TestHoodieDeltaStreamer#testBulkInsertsAndUpsertsWithBootstrap
> ITTestHoodieDemo#testParquetDemo
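
The loss of the "%2F" encoding described above can be reproduced with plain java.net.URI, without Hadoop or Spark on the classpath. The following is a minimal sketch, not the Hudi fix: the class name EncodedSlashDemo, the two helper methods, and the file name file.parquet are hypothetical and chosen only for illustration. It relies on the documented behavior that URI.getPath() returns the decoded path component while URI.getRawPath() returns it as written.

```java
import java.net.URI;

public class EncodedSlashDemo {

    // Decoded path component: every "%2F" is turned into a literal "/",
    // so the single partition directory looks like three nested directories.
    static String decodedPath(String location) {
        return URI.create(location).getPath();
    }

    // Raw path component: percent-escapes are kept exactly as written.
    static String rawPath(String location) {
        return URI.create(location).getRawPath();
    }

    public static void main(String[] args) {
        // Hypothetical file name; the partition value matches the issue's example.
        String location = "hdfs://localhost:62738/user/ethan/test_dataset_bootstrapped/"
            + "partition_path=2015%2F03%2F17/file.parquet";

        System.out.println("decoded: " + decodedPath(location));
        System.out.println("raw:     " + rawPath(location));
    }
}
```

The decoded form contains "partition_path=2015/03/17", which matches the path in the FileNotFoundException above, while the raw form still contains "partition_path=2015%2F03%2F17", which is the directory name that actually exists on the file system.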



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HUDI-4944) The encoded slash (%2F) in partition path is not properly decoded during Spark read

2023-05-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-4944:
-
Labels: pull-request-available  (was: )

> The encoded slash (%2F) in partition path is not properly decoded during 
> Spark read
> ---
>
> Key: HUDI-4944
> URL: https://issues.apache.org/jira/browse/HUDI-4944
> Project: Apache Hudi
> Issue Type: Bug
> Components: bootstrap
> Reporter: Ethan Guo
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.13.1
>
> Attachments: Untitled
>
>





[jira] [Updated] (HUDI-4944) The encoded slash (%2F) in partition path is not properly decoded during Spark read

2023-05-02 Thread Jonathan Vexler (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Vexler updated HUDI-4944:
--
Attachment: Untitled

> The encoded slash (%2F) in partition path is not properly decoded during 
> Spark read
> ---
>
> Key: HUDI-4944
> URL: https://issues.apache.org/jira/browse/HUDI-4944
> Project: Apache Hudi
> Issue Type: Bug
> Components: bootstrap
> Reporter: Ethan Guo
> Priority: Major
> Fix For: 0.13.1
>
> Attachments: Untitled
>
>





[jira] [Updated] (HUDI-4944) The encoded slash (%2F) in partition path is not properly decoded during Spark read

2022-09-28 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-4944:

Description: 
When the source partitioned parquet table of the bootstrap operation has the 
encoded slash (%2F) in the partition path, e.g., 
"partition_path=2015%2F03%2F17", after the metadata-only bootstrap with the 
bootstrap indexing storing the data file path containing the partition path 
with the encoded slash (%2F), the target bootstrapped Hudi table cannot be read 
due to FileNotFound exception.  The root cause is that the encoding of the 
slash is lost when creating the new Path instance with the URI (see below, that 
"partition_path=2015/03/17" instead of "partition_path=2015%2F03%2F17").
{code:java}
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://localhost:62738/user/ethan/test_dataset_bootstrapped/partition_path=2015/03/17/e0fa3466-d3bc-43f7-b586-2f95d8745095_3-161-675_01.parquet
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1528)
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1521)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1521)
    at org.apache.parquet.hadoop.util.HadoopInputFile.fromPath(HadoopInputFile.java:39)
    at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:448)
    at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.footerFileMetaData$lzycompute$1(Spark24HoodieParquetFileFormat.scala:131)
    at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.footerFileMetaData$1(Spark24HoodieParquetFileFormat.scala:130)
    at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.apply(Spark24HoodieParquetFileFormat.scala:134)
    at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.apply(Spark24HoodieParquetFileFormat.scala:111)
    at org.apache.hudi.HoodieDataSourceHelper$$anonfun$buildHoodieParquetReader$1.apply(HoodieDataSourceHelper.scala:71)
    at org.apache.hudi.HoodieDataSourceHelper$$anonfun$buildHoodieParquetReader$1.apply(HoodieDataSourceHelper.scala:70)
    at org.apache.hudi.HoodieBootstrapRDD.compute(HoodieBootstrapRDD.scala:60)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
{code}
The path conversion that causes the problem is in the code below.  "new 
URI(file.filePath)" decodes the "%2F" and converts the slash.

Spark24HoodieParquetFileFormat (same for Spark32PlusHoodieParquetFileFormat)
{code:java}
val fileSplit =
  new FileSplit(new Path(new URI(file.filePath)), file.start, file.length, Array.empty)
{code}
This fails the tests below and we need to use a partition path without slashes 
in the value for now: 

TestHoodieDeltaStreamer#testBulkInsertsAndUpsertsWithBootstrap

ITTestHoodieDemo#testParquetDemo

  was:
When the source partitioned parquet table of the bootstrap operation has the 
encoded slash (%2F) in the partition path, e.g., 
"partition_path=2015%2F03%2F17", after the metadata-only bootstrap with the 
bootstrap indexing storing the data file path containing the partition path 
with the encoded slash (%2F), the target bootstrapped Hudi table cannot be read 
due to FileNotFound exception.  The root cause is that the encoding of the 
slash is lost when creating the new Path instance with the URI (see below, that 
"partition_path=2015/03/17" instead of "partition_path=2015%2F03%2F17").
{code:java}
Caused by: java.io.FileNotFoundException: File does not exist: 
hdfs://localhost:62738/user/ethan/test_dataset_bootstrapped/partition_path=2015/03/17/e0fa3466-d3bc-43f7-b586-2f95d8745095_3-161-675_01.parquet
    at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1528)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1521)
    at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1521)
    at 
org.apache.parquet.hadoop.util.HadoopInputFile.fromPath(HadoopInputFile.java:39)
    at 
org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:448)
    at 
org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.footerFileMetaData$lzycompute$1(Spark24HoodieParquetFileFormat.scala:131)
    at 
org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buil

[jira] [Updated] (HUDI-4944) The encoded slash (%2F) in partition path is not properly decoded during Spark read

2022-09-28 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-4944:

Description: 
When the source partitioned parquet table of the bootstrap operation has the 
encoded slash (%2F) in the partition path, e.g., 
"partition_path=2015%2F03%2F17", after the metadata-only bootstrap with the 
bootstrap indexing storing the data file path containing the partition path 
with the encoded slash (%2F), the target bootstrapped Hudi table cannot be read 
due to FileNotFound exception.  The root cause is that the encoding of the 
slash is lost when creating the new Path instance with the URI (see below, that 
"partition_path=2015/03/17" instead of "partition_path=2015%2F03%2F17").
{code:java}
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://localhost:62738/user/ethan/test_dataset_bootstrapped/partition_path=2015/03/17/e0fa3466-d3bc-43f7-b586-2f95d8745095_3-161-675_01.parquet
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1528)
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1521)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1521)
    at org.apache.parquet.hadoop.util.HadoopInputFile.fromPath(HadoopInputFile.java:39)
    at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:448)
    at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.footerFileMetaData$lzycompute$1(Spark24HoodieParquetFileFormat.scala:131)
    at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.footerFileMetaData$1(Spark24HoodieParquetFileFormat.scala:130)
    at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.apply(Spark24HoodieParquetFileFormat.scala:134)
    at org.apache.spark.sql.execution.datasources.parquet.Spark24HoodieParquetFileFormat$$anonfun$buildReaderWithPartitionValues$1.apply(Spark24HoodieParquetFileFormat.scala:111)
    at org.apache.hudi.HoodieDataSourceHelper$$anonfun$buildHoodieParquetReader$1.apply(HoodieDataSourceHelper.scala:71)
    at org.apache.hudi.HoodieDataSourceHelper$$anonfun$buildHoodieParquetReader$1.apply(HoodieDataSourceHelper.scala:70)
    at org.apache.hudi.HoodieBootstrapRDD.compute(HoodieBootstrapRDD.scala:60)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
{code}
The path conversion that causes the problem is in the code below.  "new 
URI(file.filePath)" decodes the "%2F" and converts the slash.

Spark24HoodieParquetFileFormat (same for Spark32PlusHoodieParquetFileFormat)
{code:java}
val fileSplit =
  new FileSplit(new Path(new URI(file.filePath)), file.start, file.length, Array.empty)
{code}

  was:When the source partitioned parquet table of the bootstrap operation has 
the 


> The encoded slash (%2F) in partition path is not properly decoded during 
> Spark read
> ---
>
> Key: HUDI-4944
> URL: https://issues.apache.org/jira/browse/HUDI-4944
> Project: Apache Hudi
> Issue Type: Bug
> Components: bootstrap
> Reporter: Ethan Guo
> Priority: Major
> Fix For: 0.13.0
>
>

[jira] [Updated] (HUDI-4944) The encoded slash (%2F) in partition path is not properly decoded during Spark read

2022-09-28 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-4944:

Description: When the source partitioned parquet table of the bootstrap 
operation has the   (was: When the source)

> The encoded slash (%2F) in partition path is not properly decoded during 
> Spark read
> ---
>
> Key: HUDI-4944
> URL: https://issues.apache.org/jira/browse/HUDI-4944
> Project: Apache Hudi
> Issue Type: Bug
> Components: bootstrap
> Reporter: Ethan Guo
> Priority: Major
> Fix For: 0.13.0
>
>
> When the source partitioned parquet table of the bootstrap operation has the 





[jira] [Updated] (HUDI-4944) The encoded slash (%2F) in partition path is not properly decoded during Spark read

2022-09-28 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-4944:

Description: When the source

> The encoded slash (%2F) in partition path is not properly decoded during 
> Spark read
> ---
>
> Key: HUDI-4944
> URL: https://issues.apache.org/jira/browse/HUDI-4944
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Ethan Guo
> Priority: Major
> Fix For: 0.13.0
>
>
> When the source





[jira] [Updated] (HUDI-4944) The encoded slash (%2F) in partition path is not properly decoded during Spark read

2022-09-28 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-4944:

Component/s: bootstrap

> The encoded slash (%2F) in partition path is not properly decoded during 
> Spark read
> ---
>
> Key: HUDI-4944
> URL: https://issues.apache.org/jira/browse/HUDI-4944
> Project: Apache Hudi
> Issue Type: Bug
> Components: bootstrap
> Reporter: Ethan Guo
> Priority: Major
> Fix For: 0.13.0
>
>
> When the source





[jira] [Updated] (HUDI-4944) The encoded slash (%2F) in partition path is not properly decoded during Spark read

2022-09-28 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-4944:

Summary: The encoded slash (%2F) in partition path is not properly decoded 
during Spark read  (was: The encoded slash (%2F) in partition path is not 
properly decoded)

> The encoded slash (%2F) in partition path is not properly decoded during 
> Spark read
> ---
>
> Key: HUDI-4944
> URL: https://issues.apache.org/jira/browse/HUDI-4944
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Ethan Guo
> Priority: Major
>






[jira] [Updated] (HUDI-4944) The encoded slash (%2F) in partition path is not properly decoded during Spark read

2022-09-28 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-4944:

Fix Version/s: 0.13.0

> The encoded slash (%2F) in partition path is not properly decoded during 
> Spark read
> ---
>
> Key: HUDI-4944
> URL: https://issues.apache.org/jira/browse/HUDI-4944
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Ethan Guo
> Priority: Major
> Fix For: 0.13.0
>
>



