Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2026-04-07 Thread via GitHub


tustvold commented on PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-4201403770

   This PR was closed due to inactivity, the issue is still open


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2026-04-07 Thread via GitHub


TomNicholas commented on PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-4200878647

   hi @tustvold - why was this closed? As far as I can tell its still an 
unresolved issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2026-01-16 Thread via GitHub


tustvold closed pull request #376: Add option to Azure client to ignore 
unparsable paths
URL: https://github.com/apache/arrow-rs-object-store/pull/376


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2025-06-16 Thread via GitHub


crepererum commented on code in PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2150396648


##
src/azure/client.rs:
##
@@ -1058,6 +1063,12 @@ fn to_list_result(value: ListResultInternal, prefix: 
Option<&str>) -> Result prefix.len()
 })
+.map(BlobInternal::try_from)
+.filter_map(|parsed| match parsed {
+Ok(parsed) => Some(parsed),
+Err(_) if ignore_unparsable_paths => None,
+Err(e) => panic!("cannot parse path: {e}"),

Review Comment:
   > though now that we've released 0.12.2 we can now merge breaking changes, 
right?
   
   right :+1: 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2025-06-16 Thread via GitHub


kylebarron commented on code in PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2150352892


##
src/azure/client.rs:
##
@@ -1058,6 +1063,12 @@ fn to_list_result(value: ListResultInternal, prefix: 
Option<&str>) -> Result prefix.len()
 })
+.map(BlobInternal::try_from)
+.filter_map(|parsed| match parsed {
+Ok(parsed) => Some(parsed),
+Err(_) if ignore_unparsable_paths => None,
+Err(e) => panic!("cannot parse path: {e}"),

Review Comment:
   In 
https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-2927448265
 @ttomasz said returning an error might require a breaking change, though now 
that we've released 0.12.2 we can now merge breaking changes, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2025-06-16 Thread via GitHub


kylebarron commented on code in PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2150337728


##
src/azure/client.rs:
##
@@ -1058,6 +1063,12 @@ fn to_list_result(value: ListResultInternal, prefix: 
Option<&str>) -> Result prefix.len()
 })
+.map(BlobInternal::try_from)
+.filter_map(|parsed| match parsed {
+Ok(parsed) => Some(parsed),
+Err(_) if ignore_unparsable_paths => None,
+Err(e) => panic!("cannot parse path: {e}"),

Review Comment:
   Oh sorry I was just agreeing with you that it's a good question 😅. I don't 
recall what the current behavior is



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2025-06-16 Thread via GitHub


crepererum commented on code in PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2149310536


##
src/azure/client.rs:
##
@@ -1058,6 +1063,12 @@ fn to_list_result(value: ListResultInternal, prefix: 
Option<&str>) -> Result prefix.len()
 })
+.map(BlobInternal::try_from)
+.filter_map(|parsed| match parsed {
+Ok(parsed) => Some(parsed),
+Err(_) if ignore_unparsable_paths => None,
+Err(e) => panic!("cannot parse path: {e}"),

Review Comment:
   I read @kylebarron's :+1: on 
https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-2966380560
 as "current behavior is there's an error, NOT a panic". So I think the flag 
should ignore the error, but even if the flag isn't set, you MUST NOT panic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2025-06-13 Thread via GitHub


kylebarron commented on code in PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2145334055


##
src/azure/builder.rs:
##
@@ -180,6 +180,8 @@ pub struct MicrosoftAzureBuilder {
 fabric_cluster_identifier: Option,
 /// The [`HttpConnector`] to use
 http_connector: Option>,
+/// When listing objects, ignore paths that throw and error when parsing

Review Comment:
   ```suggestion
   /// When listing objects, ignore paths that throw an error when parsing
   ```
   Might also be good to say "See [`AzureConfigKey::IgnoreUnparseablePaths`] 
for more details



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2025-06-12 Thread via GitHub


crepererum commented on PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-2966380560

   What's the current behavior on `object_store`? I suppose it returns an error 
if the path is unparsable, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2025-06-01 Thread via GitHub


ttomasz commented on PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#issuecomment-2927448265

   > Apart from the notes I've left in-line, I'm wondering if we should even 
panic or should rather return an error. This would be in line with all kinds of 
unexpected server answers that we handle so far (e.g. weird status codes, 
broken bodies, TLS failures). The flag would then toggle between "error 
returned" and "error ignored" instead of "panic" and "no panic". WDYT?
   
   Thanks for the review.
   
   In general it would probably be better to pass the error for user to handle 
instead of throwing exception/panicking (or as I mentioned in the linked issue 
there could be a separate interface to get paths as they come i.e. strings 
instead of parsing them into Path objects). Then the flag wouldn't even be 
needed.
   
   But I don't know how that could be implemented without backwards 
incompatible changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2025-06-01 Thread via GitHub


ttomasz commented on code in PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2119269548


##
src/azure/client.rs:
##
@@ -1058,6 +1063,9 @@ fn to_list_result(value: ListResultInternal, prefix: 
Option<&str>) -> Result prefix.len()
 })
+.map(BlobInternal::try_from)
+.filter(|parsed| !ignore_unparsable_paths || parsed.is_ok())
+.map(|parsed| parsed.unwrap())

Review Comment:
   Added suggested code



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2025-06-01 Thread via GitHub


ttomasz commented on code in PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2119268885


##
src/azure/client.rs:
##
@@ -1540,4 +1565,135 @@ Time:2018-06-14T16:46:54.6040685Z\r
 assert_eq!("404", code);
 assert_eq!("The specified blob does not exist.", reason);
 }
+
+#[tokio::test]
+async fn test_list_blobs() {
+let fake_properties = BlobProperties {
+last_modified: Utc::now(),
+content_length: 8,
+content_type: "text/plain".to_string(),
+content_encoding: None,
+content_language: None,
+e_tag: Some("etag".to_string()),
+resource_type: Some("resource".to_string()),
+};
+let fake_result = ListResultInternal {
+prefix: None,
+max_results: None,
+delimiter: None,
+next_marker: None,
+blobs: Blobs {
+blob_prefix: vec![],
+blobs: vec![
+Blob {
+name: "blob0.txt".to_string(),
+version_id: None,
+is_current_version: None,
+deleted: None,
+properties: fake_properties.clone(),
+metadata: None,
+},
+Blob {
+name: "blob1.txt".to_string(),
+version_id: None,
+is_current_version: None,
+deleted: None,
+properties: fake_properties.clone(),
+metadata: None,
+},
+],
+},
+};
+let result = to_list_result(fake_result, None, false).unwrap();
+assert_eq!(result.common_prefixes.len(), 0);
+assert_eq!(result.objects.len(), 2);
+assert_eq!(result.objects[0].location, Path::from("blob0.txt"));
+assert_eq!(result.objects[1].location, Path::from("blob1.txt"));
+}
+
+#[tokio::test]
+#[should_panic]

Review Comment:
   Added expected error message



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Add option to Azure client to ignore unparsable paths [arrow-rs-object-store]

2025-05-27 Thread via GitHub


crepererum commented on code in PR #376:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/376#discussion_r2108835862


##
src/azure/client.rs:
##
@@ -1540,4 +1565,135 @@ Time:2018-06-14T16:46:54.6040685Z\r
 assert_eq!("404", code);
 assert_eq!("The specified blob does not exist.", reason);
 }
+
+#[tokio::test]
+async fn test_list_blobs() {
+let fake_properties = BlobProperties {
+last_modified: Utc::now(),
+content_length: 8,
+content_type: "text/plain".to_string(),
+content_encoding: None,
+content_language: None,
+e_tag: Some("etag".to_string()),
+resource_type: Some("resource".to_string()),
+};
+let fake_result = ListResultInternal {
+prefix: None,
+max_results: None,
+delimiter: None,
+next_marker: None,
+blobs: Blobs {
+blob_prefix: vec![],
+blobs: vec![
+Blob {
+name: "blob0.txt".to_string(),
+version_id: None,
+is_current_version: None,
+deleted: None,
+properties: fake_properties.clone(),
+metadata: None,
+},
+Blob {
+name: "blob1.txt".to_string(),
+version_id: None,
+is_current_version: None,
+deleted: None,
+properties: fake_properties.clone(),
+metadata: None,
+},
+],
+},
+};
+let result = to_list_result(fake_result, None, false).unwrap();
+assert_eq!(result.common_prefixes.len(), 0);
+assert_eq!(result.objects.len(), 2);
+assert_eq!(result.objects[0].location, Path::from("blob0.txt"));
+assert_eq!(result.objects[1].location, Path::from("blob1.txt"));
+}
+
+#[tokio::test]
+#[should_panic]

Review Comment:
   ```suggestion
   #[should_panic(expected = "???")]
   ```
   
   I find it helpful to check the panic message (or a substring of it) to 
ensure it's providing some useful user error.



##
src/azure/client.rs:
##
@@ -1058,6 +1063,9 @@ fn to_list_result(value: ListResultInternal, prefix: 
Option<&str>) -> Result prefix.len()
 })
+.map(BlobInternal::try_from)
+.filter(|parsed| !ignore_unparsable_paths || parsed.is_ok())
+.map(|parsed| parsed.unwrap())

Review Comment:
   I think instead of rolling this into two iterator steps, this would be 
easier to reason about:
   
   ```suggestion
   .filter_map(|parsed| {
   match parsed {
   Ok(parsed) => Some(parsed),
   Err(_) if ignore_unparsable_paths => None,
   Err(e) => panic!("cannot parse path: {e}"),
   }
   })
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]