[
https://issues.apache.org/jira/browse/HADOOP-19474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17932857#comment-17932857
]
ASF GitHub Bot commented on HADOOP-19474:
-----------------------------------------
anmolanmol1234 commented on code in PR #7421:
URL: https://github.com/apache/hadoop/pull/7421#discussion_r1982780815
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java:
##########
@@ -1583,26 +1590,40 @@ public Hashtable<String, String> getXMSProperties(AbfsHttpOperation result)
/**
* Parse the XML response body returned by ListBlob API on Blob Endpoint.
- * @param stream InputStream contains the response from server.
- * @return BlobListResultSchema containing the list of entries.
- * @throws IOException if parsing fails.
+ * @param result InputStream contains the response from server.
+ * @param uri to be used for path conversion.
+ * @return {@link ListResponseData}. containing listing response.
+ * @throws AzureBlobFileSystemException if parsing fails.
*/
@Override
- public ListResultSchema parseListPathResults(final InputStream stream) throws IOException {
- if (stream == null) {
- return null;
- }
+ public ListResponseData parseListPathResults(AbfsHttpOperation result, URI uri)
+ throws AzureBlobFileSystemException {
BlobListResultSchema listResultSchema;
- try {
- final SAXParser saxParser = saxParserThreadLocal.get();
- saxParser.reset();
- listResultSchema = new BlobListResultSchema();
- saxParser.parse(stream, new BlobListXmlParser(listResultSchema, getBaseUrl().toString()));
- } catch (SAXException | IOException e) {
- throw new RuntimeException(e);
+ try (InputStream stream = result.getListResultStream()) {
+ if (stream == null) {
+ return null;
+ }
+ try {
+ final SAXParser saxParser = saxParserThreadLocal.get();
+ saxParser.reset();
+ listResultSchema = new BlobListResultSchema();
+ saxParser.parse(stream,
+ new BlobListXmlParser(listResultSchema, getBaseUrl().toString()));
+ result.setListResultSchema(listResultSchema);
+ } catch (SAXException | IOException e) {
+ throw new AbfsDriverException(e);
+ }
+ } catch (IOException e) {
+ LOG.error("Unable to deserialize list results", e);
Review Comment:
Given we have the uri now, should we include it in the error log as well?
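To illustrate the suggestion, here is a minimal stand-alone sketch of an error message that carries the URI. It is not the PR's code: the class `ListErrorLogSketch` and its methods are hypothetical, and it uses `java.util.logging` only so that it runs without dependencies. If `LOG` in `AbfsBlobClient` is an SLF4J logger (an assumption), the equivalent call would be roughly `LOG.error("Unable to deserialize list results for uri {}", uri, e);`.

```java
import java.io.IOException;
import java.net.URI;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-alone sketch; not the actual AbfsBlobClient code.
class ListErrorLogSketch {
  private static final Logger LOG =
      Logger.getLogger(ListErrorLogSketch.class.getName());

  // Builds the error message with the listing URI included.
  // A null uri is tolerated so that logging itself never fails.
  static String describeFailure(URI uri) {
    return "Unable to deserialize list results for uri "
        + (uri == null ? "<unknown>" : uri.toString());
  }

  // Logs the failure together with the causing exception.
  static void logFailure(URI uri, IOException e) {
    LOG.log(Level.SEVERE, describeFailure(uri), e);
  }

  public static void main(String[] args) {
    // Example URI is made up for illustration.
    logFailure(URI.create("https://account.blob.core.windows.net/container/dir"),
        new IOException("stream closed"));
  }
}
```

Having the URI in the message makes it possible to tell which listing call failed when several are in flight.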
> ABFS: [FnsOverBlob] Listing Optimizations to avoid multiple iteration over
> list response.
> -----------------------------------------------------------------------------------------
>
> Key: HADOOP-19474
> URL: https://issues.apache.org/jira/browse/HADOOP-19474
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.5.0, 3.4.1
> Reporter: Anuj Modi
> Assignee: Anuj Modi
> Priority: Major
> Labels: pull-request-available
>
> On the blob endpoint, some handling needs to be done on the client side.
> This involves:
> # Parsing the XML response and converting it to a VersionedFileStatus list
> # Removing duplicate entries for non-empty explicit directories that appear
> due to the presence of marker files
> # Triggering rename recovery for a previously failed rename, indicated by
> the presence of a pending JSON file.
> Currently all three are done in separate iterations over the whole list. This
> change is to bring all of them to a common place so that a single iteration
> over the list response can handle all three.
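The single-iteration idea described in the issue can be sketched roughly as below. This is an illustrative assumption, not the PR's implementation: the class `SinglePassListingSketch`, the `Entry` and `Result` types, and the `-RenamePending.json` suffix are hypothetical stand-ins, and XML parsing is replaced by an already-parsed list so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and the suffix below are assumptions.
class SinglePassListingSketch {
  static final String RENAME_PENDING_SUFFIX = "-RenamePending.json";

  // Minimal stand-in for one parsed listing entry.
  static final class Entry {
    final String path;
    final boolean isDirectory;
    Entry(String path, boolean isDirectory) {
      this.path = path;
      this.isDirectory = isDirectory;
    }
  }

  // Output of the single pass: de-duplicated entries plus the
  // paths that look like pending-rename markers.
  static final class Result {
    final List<Entry> statuses;
    final List<String> renameRecoveryCandidates;
    Result(List<Entry> statuses, List<String> renameRecoveryCandidates) {
      this.statuses = statuses;
      this.renameRecoveryCandidates = renameRecoveryCandidates;
    }
  }

  // One loop handles both de-duplication and pending-rename detection.
  static Result process(List<Entry> parsed) {
    Map<String, Entry> byPath = new LinkedHashMap<>();
    List<String> pending = new ArrayList<>();
    for (Entry e : parsed) {
      if (!e.isDirectory && e.path.endsWith(RENAME_PENDING_SUFFIX)) {
        // Only recorded here; actual recovery is out of scope for the sketch.
        pending.add(e.path);
      }
      // Keep the first entry seen per path, preserving listing order.
      byPath.putIfAbsent(e.path, e);
    }
    return new Result(new ArrayList<>(byPath.values()), pending);
  }
}
```

The sketch keeps the pending-rename file in the returned list; whether the real listing should hide it after recovery is a separate decision.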