[ 
https://issues.apache.org/jira/browse/HUDI-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo closed HUDI-5545.
---------------------------
    Resolution: Fixed

> Extending support to other special characters for S3EventsMetaSelector
> ----------------------------------------------------------------------
>
>                 Key: HUDI-5545
>                 URL: https://issues.apache.org/jira/browse/HUDI-5545
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Ethan Guo
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 0.13.0
>
>
> This fix covers the following issue.
> I am working on ingestion with S3 as the source by following this 
> [blog|https://hudi.apache.org/blog/2021/08/23/s3-events-source/]. But the second 
> job (S3EventsHoodieIncrSource) is failing with
> {{HoodieException: org.apache.hudi.exception.HoodieException: Path does not 
> exist}}. In our investigation, we observed that the job fails because of encoded 
> characters (these are added by SQS) in the S3 object name.
> When we dug into the Hudi source code, we observed that Hudi decodes them 
> in 
> [S3EventsMetaSelector|https://github.com/apache/hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/S3EventsMetaSelector.java#L154]
> and that at the moment only {{=}} is handled.
> FYI:
> Original S3 object: 
> {{s3://<bucket>/s3_parquet_source_data/s3-test+0+0000061344.parquet}}
> Encoded S3 object: 
> {{s3://<bucket>/s3_parquet_source_data/s3-test%2B0%2B0000061344.parquet}}
> Note: the workflow ran successfully once the file name was corrected.
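
For illustration, the mismatch described above can be reproduced and undone with a general URL decode instead of a single-character replacement. This is only a minimal sketch, assuming the object keys arrive in standard URL-encoded form; the class name S3KeyDecodeSketch and the method decodeKey are hypothetical, and this is not necessarily how the Hudi fix is implemented.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, not part of Hudi: decodes a URL-encoded S3 object key.
public class S3KeyDecodeSketch {

    // Decodes every percent-encoded sequence in the key (e.g. %2B -> '+', %3D -> '=').
    // Note: URLDecoder also turns an unescaped '+' into a space.
    static String decodeKey(String encodedKey) {
        try {
            return URLDecoder.decode(encodedKey, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    public static void main(String[] args) {
        // Encoded key taken from the report above (bucket omitted).
        String encoded = "s3_parquet_source_data/s3-test%2B0%2B0000061344.parquet";
        // prints "s3_parquet_source_data/s3-test+0+0000061344.parquet"
        System.out.println(decodeKey(encoded));
    }
}
```

One caveat with this approach: because URLDecoder maps a raw '+' to a space, it is only appropriate when the input really is URL-encoded. Applying it to an already decoded key would corrupt names that contain '+'.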



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
