Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
tustvold commented on PR #376: URL: https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-4201403770 This PR was closed due to inactivity, the issue is still open -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
TomNicholas commented on PR #376: URL: https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-4200878647 hi @tustvold - why was this closed? As far as I can tell its still an unresolved issue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
tustvold closed pull request #376: Add option to Azure client to ignore unparsable paths URL: https://github.com/apache/arrow-rs-object-store/pull/376 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
crepererum commented on code in PR #376:
URL:
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2150396648
##
src/azure/client.rs:
##
@@ -1058,6 +1063,12 @@ fn to_list_result(value: ListResultInternal, prefix:
Option<&str>) -> Result prefix.len()
})
+.map(BlobInternal::try_from)
+.filter_map(|parsed| match parsed {
+Ok(parsed) => Some(parsed),
+Err(_) if ignore_unparsable_paths => None,
+Err(e) => panic!("cannot parse path: {e}"),
Review Comment:
> though now that we've released 0.12.2 we can now merge breaking changes,
right?
right :+1:
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
kylebarron commented on code in PR #376:
URL:
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2150352892
##
src/azure/client.rs:
##
@@ -1058,6 +1063,12 @@ fn to_list_result(value: ListResultInternal, prefix:
Option<&str>) -> Result prefix.len()
})
+.map(BlobInternal::try_from)
+.filter_map(|parsed| match parsed {
+Ok(parsed) => Some(parsed),
+Err(_) if ignore_unparsable_paths => None,
+Err(e) => panic!("cannot parse path: {e}"),
Review Comment:
In
https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-2927448265
@ttomasz said returning an error might require a breaking change, though now
that we've released 0.12.2 we can now merge breaking changes, right?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
kylebarron commented on code in PR #376:
URL:
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2150337728
##
src/azure/client.rs:
##
@@ -1058,6 +1063,12 @@ fn to_list_result(value: ListResultInternal, prefix:
Option<&str>) -> Result prefix.len()
})
+.map(BlobInternal::try_from)
+.filter_map(|parsed| match parsed {
+Ok(parsed) => Some(parsed),
+Err(_) if ignore_unparsable_paths => None,
+Err(e) => panic!("cannot parse path: {e}"),
Review Comment:
Oh sorry I was just agreeing with you that it's a good question 😅. I don't
recall what the current behavior is
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
crepererum commented on code in PR #376:
URL:
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2149310536
##
src/azure/client.rs:
##
@@ -1058,6 +1063,12 @@ fn to_list_result(value: ListResultInternal, prefix:
Option<&str>) -> Result prefix.len()
})
+.map(BlobInternal::try_from)
+.filter_map(|parsed| match parsed {
+Ok(parsed) => Some(parsed),
+Err(_) if ignore_unparsable_paths => None,
+Err(e) => panic!("cannot parse path: {e}"),
Review Comment:
I read @kylebarron's :+1: on
https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-2966380560
as "current behavior is there's an error, NOT a panic". So I think the flag
should ignore the error, but even if the flag isn't set, you MUST NOT panic.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
kylebarron commented on code in PR #376:
URL:
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2145334055
##
src/azure/builder.rs:
##
@@ -180,6 +180,8 @@ pub struct MicrosoftAzureBuilder {
fabric_cluster_identifier: Option,
/// The [`HttpConnector`] to use
http_connector: Option>,
+/// When listing objects, ignore paths that throw and error when parsing
Review Comment:
```suggestion
/// When listing objects, ignore paths that throw an error when parsing
```
Might also be good to say "See [`AzureConfigKey::IgnoreUnparseablePaths`]
for more details
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
crepererum commented on PR #376: URL: https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-2966380560 What's the current behavior on `object_store`? I suppose it returns an error if the path is unparsable, right? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
ttomasz commented on PR #376: URL: https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-2927448265 > Apart from the notes I've left in-line, I'm wondering if we should even panic or should rather return an error. This would be in line with all kinds of unexpected server answers that we handle so far (e.g. weird status codes, broken bodies, TLS failures). The flag would then toggle between "error returned" and "error ignored" instead of "panic" and "no panic". WDYT? Thanks for the review. In general it would probably be better to pass the error for user to handle instead of throwing exception/panicking (or as I mentioned in the linked issue there could be a separate interface to get paths as they come i.e. strings instead of parsing them into Path objects). Then the flag wouldn't even be needed. But I don't know how that could be implemented without backwards incompatible changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
ttomasz commented on code in PR #376: URL: https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2119269548 ## src/azure/client.rs: ## @@ -1058,6 +1063,9 @@ fn to_list_result(value: ListResultInternal, prefix: Option<&str>) -> Result prefix.len() }) +.map(BlobInternal::try_from) +.filter(|parsed| !ignore_unparsable_paths || parsed.is_ok()) +.map(|parsed| parsed.unwrap()) Review Comment: Added suggested code -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
ttomasz commented on code in PR #376:
URL:
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2119268885
##
src/azure/client.rs:
##
@@ -1540,4 +1565,135 @@ Time:2018-06-14T16:46:54.6040685Z\r
assert_eq!("404", code);
assert_eq!("The specified blob does not exist.", reason);
}
+
+#[tokio::test]
+async fn test_list_blobs() {
+let fake_properties = BlobProperties {
+last_modified: Utc::now(),
+content_length: 8,
+content_type: "text/plain".to_string(),
+content_encoding: None,
+content_language: None,
+e_tag: Some("etag".to_string()),
+resource_type: Some("resource".to_string()),
+};
+let fake_result = ListResultInternal {
+prefix: None,
+max_results: None,
+delimiter: None,
+next_marker: None,
+blobs: Blobs {
+blob_prefix: vec![],
+blobs: vec![
+Blob {
+name: "blob0.txt".to_string(),
+version_id: None,
+is_current_version: None,
+deleted: None,
+properties: fake_properties.clone(),
+metadata: None,
+},
+Blob {
+name: "blob1.txt".to_string(),
+version_id: None,
+is_current_version: None,
+deleted: None,
+properties: fake_properties.clone(),
+metadata: None,
+},
+],
+},
+};
+let result = to_list_result(fake_result, None, false).unwrap();
+assert_eq!(result.common_prefixes.len(), 0);
+assert_eq!(result.objects.len(), 2);
+assert_eq!(result.objects[0].location, Path::from("blob0.txt"));
+assert_eq!(result.objects[1].location, Path::from("blob1.txt"));
+}
+
+#[tokio::test]
+#[should_panic]
Review Comment:
Added expected error message
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]
crepererum commented on code in PR #376:
URL:
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2108835862
##
src/azure/client.rs:
##
@@ -1540,4 +1565,135 @@ Time:2018-06-14T16:46:54.6040685Z\r
assert_eq!("404", code);
assert_eq!("The specified blob does not exist.", reason);
}
+
+#[tokio::test]
+async fn test_list_blobs() {
+let fake_properties = BlobProperties {
+last_modified: Utc::now(),
+content_length: 8,
+content_type: "text/plain".to_string(),
+content_encoding: None,
+content_language: None,
+e_tag: Some("etag".to_string()),
+resource_type: Some("resource".to_string()),
+};
+let fake_result = ListResultInternal {
+prefix: None,
+max_results: None,
+delimiter: None,
+next_marker: None,
+blobs: Blobs {
+blob_prefix: vec![],
+blobs: vec![
+Blob {
+name: "blob0.txt".to_string(),
+version_id: None,
+is_current_version: None,
+deleted: None,
+properties: fake_properties.clone(),
+metadata: None,
+},
+Blob {
+name: "blob1.txt".to_string(),
+version_id: None,
+is_current_version: None,
+deleted: None,
+properties: fake_properties.clone(),
+metadata: None,
+},
+],
+},
+};
+let result = to_list_result(fake_result, None, false).unwrap();
+assert_eq!(result.common_prefixes.len(), 0);
+assert_eq!(result.objects.len(), 2);
+assert_eq!(result.objects[0].location, Path::from("blob0.txt"));
+assert_eq!(result.objects[1].location, Path::from("blob1.txt"));
+}
+
+#[tokio::test]
+#[should_panic]
Review Comment:
```suggestion
#[should_panic(expected = "???")]
```
I find it helpful to check the panic message (or a substring of it) to
ensure it's providing some useful user error.
##
src/azure/client.rs:
##
@@ -1058,6 +1063,9 @@ fn to_list_result(value: ListResultInternal, prefix:
Option<&str>) -> Result prefix.len()
})
+.map(BlobInternal::try_from)
+.filter(|parsed| !ignore_unparsable_paths || parsed.is_ok())
+.map(|parsed| parsed.unwrap())
Review Comment:
I think instead of rolling this into two iterator steps, this would be
easier to reason about:
```suggestion
.filter_map(|parsed| {
match parsed {
Ok(parsed) => Some(parsed),
Err(_) if ignore_unparsable_paths => None,
Err(e) => panic!("cannot parse path: {e}"),
}
})
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
