[ 
https://issues.apache.org/jira/browse/JCLOUDS-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860666#comment-17860666
 ] 

Jacob Nguyen edited comment on JCLOUDS-1638 at 6/28/24 12:07 AM:
-----------------------------------------------------------------

One more thing to note: newer versions of the AWS S3 SDK use 
encoding-type=url by default.
Not sure whether jclouds should do the same.

https://github.com/aws/aws-sdk-java/issues/333#issuecomment-213096411


was (Author: JIRAUSER306001):
One more thing to note: newer versions of the AWS S3 SDK use 
encoding-type=url by default.
Not sure whether jclouds should do the same.

https://github.com/aws/aws-sdk-java/issues/460#issuecomment-240296956

> SAXParseException on S3 Listing
> -------------------------------
>
>                 Key: JCLOUDS-1638
>                 URL: https://issues.apache.org/jira/browse/JCLOUDS-1638
>             Project: jclouds
>          Issue Type: Bug
>    Affects Versions: 2.5.0, 2.6.0
>            Reporter: Jacob Nguyen
>            Assignee: Andrew Gaul
>            Priority: Major
>
> {noformat}
> java.lang.RuntimeException: request: GET 
> https://cloudsync-performance-tests.s3.amazonaws.com/?delimiter=/&prefix=some/&max-keys=1000
>  HTTP/1.1; response: HTTP/1.1 200 OK; cause: java.lang.RuntimeException: 
> request: GET 
> https://cloudsync-performance-tests.s3.amazonaws.com/?delimiter=/&prefix=some/&max-keys=1000
>  HTTP/1.1; error at 586:2 in document ; cause: org.xml.sax.SAXParseException; 
> lineNumber: 2; columnNumber: 586; Character reference "&#x10" is an invalid 
> XML character.
>       at 
> org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:174)
>       at 
> org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:146)
>       at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:86)
>       at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:52)
>       at 
> org.jclouds.rest.internal.InvokeHttpMethod.invoke(InvokeHttpMethod.java:91)
>       at 
> org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:74)
>       at 
> org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:45)
>       at 
> org.jclouds.rest.internal.DelegatesToInvocationFunction.handle(DelegatesToInvocationFunction.java:156)
>       at 
> org.jclouds.rest.internal.DelegatesToInvocationFunction.invoke(DelegatesToInvocationFunction.java:123)
>       at jdk.proxy2/jdk.proxy2.$Proxy235.listBucket(Unknown Source)
>       at org.jclouds.s3.blobstore.S3BlobStore.list(S3BlobStore.java:177)
> {noformat}
> When a folder path in S3 contains a control character, the listing response 
> can't be parsed because the SAX parser throws a SAXParseException.
> Could there be an option that at least lets us forward the encoding-type 
> param, and that URL-decodes the returned keys for us so that listing is 
> possible?
> https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html#API_ListObjects_RequestSyntax
> This bug currently prevents us from listing any children of a root folder if 
> one of the children contains control characters.
> Here's an example XML response from S3 when listing objects with cURL:
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?>
> <ListBucketResult 
> xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>cloudsync-performance-tests</Name><Prefix>some/</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><IsTruncated>false</IsTruncated><CommonPrefixes><Prefix>some/test&#x10;/</Prefix></CommonPrefixes></ListBucketResult>
> {noformat}
> Child folder of 'some' contains 
> {noformat}
> <Prefix>some/test&#x10;/</Prefix>
> {noformat}
> which can't be parsed.
> But with the URL parameter &encoding-type=url:
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?>
> <ListBucketResult 
> xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>cloudsync-performance-tests</Name><Prefix>some/</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>/</Delimiter><EncodingType>url</EncodingType><IsTruncated>false</IsTruncated><CommonPrefixes><Prefix>some/test%10/</Prefix></CommonPrefixes></ListBucketResult>
> {noformat}
> The child folder of 'some' is then returned as
> {noformat}
> <Prefix>some/test%10/</Prefix>
> {noformat}
> which can probably be parsed.
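
For illustration, here is a minimal standalone sketch of the difference between the two responses, using only the JDK's built-in SAX parser and java.net.URLDecoder. It is not jclouds code: the class name EncodingTypeSketch, the helpers prefixes() and decodeKey(), and the trimmed-down XML strings are all made up for this example, and it assumes that URLDecoder is an acceptable way to decode keys returned with encoding-type=url.

```java
import java.io.StringReader;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of jclouds.
public class EncodingTypeSketch {

    // Trimmed-down versions of the two responses quoted in the issue.
    public static final String RAW =
        "<ListBucketResult><CommonPrefixes><Prefix>some/test&#x10;/</Prefix>"
        + "</CommonPrefixes></ListBucketResult>";
    public static final String URL_ENCODED =
        "<ListBucketResult><EncodingType>url</EncodingType><CommonPrefixes>"
        + "<Prefix>some/test%10/</Prefix></CommonPrefixes></ListBucketResult>";

    /** Collects the text of every Prefix element using the JDK SAX parser. */
    public static List<String> prefixes(String xml) throws Exception {
        final List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                text = new StringBuilder(); // start collecting text for this element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("Prefix".equals(qName) && text != null) {
                    found.add(text.toString());
                }
                text = null;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return found;
    }

    /** Assumption: plain URL-decoding is enough to recover the original key. */
    public static String decodeKey(String encoded) throws Exception {
        return URLDecoder.decode(encoded, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        try {
            prefixes(RAW);
        } catch (SAXParseException e) {
            // &#x10; is not a legal XML 1.0 character reference, so parsing aborts here.
            System.out.println("raw listing rejected: " + e.getMessage());
        }
        for (String p : prefixes(URL_ENCODED)) {
            String key = decodeKey(p);
            System.out.println("parsed " + p + ", decoded key has " + key.length() + " chars");
        }
    }
}
```

If jclouds forwarded encoding-type=url, the decode step would presumably have to run on every key, prefix, marker and delimiter that S3 returns in encoded form, not only on the common prefixes shown here.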



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
