BlakeOrth commented on code in PR #18146:
URL: https://github.com/apache/datafusion/pull/18146#discussion_r2504706437


##########
datafusion/core/tests/datasource/object_store_access.rs:
##########
@@ -194,17 +183,8 @@ async fn query_partitioned_csv_file() {
     +---------+-------+-------+---+----+-----+
     ------- Object Store Request Summary -------
     RequestCountingObjectStore()
-    Total Requests: 11
-    - LIST (with delimiter) prefix=data
-    - LIST (with delimiter) prefix=data/a=1
-    - LIST (with delimiter) prefix=data/a=2
-    - LIST (with delimiter) prefix=data/a=3
-    - LIST (with delimiter) prefix=data/a=1/b=10
-    - LIST (with delimiter) prefix=data/a=2/b=20
-    - LIST (with delimiter) prefix=data/a=3/b=30
-    - LIST (with delimiter) prefix=data/a=1/b=10/c=100
-    - LIST (with delimiter) prefix=data/a=2/b=20/c=200
-    - LIST (with delimiter) prefix=data/a=3/b=30/c=300
+    Total Requests: 2
+    - LIST prefix=data
     - GET  (opts) path=data/a=2/b=20/c=200/file_2.csv

Review Comment:
   > * Implement the relevant prefix filtering on the client (e.g. if we have 
cached `LIST /path/to/foo` and then get a request for `LIST /path/to/foo/bar` 
we could try and filter / prefix match the entry in the cache)
   > 
   > * Not handle sub-prefix matches
   
   The second option would certainly be the simplest, but it seems sad to 
skip sub-prefix matches entirely. I think the first option sounds doable, 
although it's not clear to me whether prefix filtering inside the cache is any 
better than simply filtering the listing stream by prefix, as the code in this 
PR already does.
   
   Cache entries are another interesting problem if we allow listing of 
specific prefixes: a cached listing of `path/to/foo/bar` cannot reliably 
fulfill a request for `path/to/foo`, but if a user is repeatedly querying 
`path/to/foo/bar` it sure would be nice to allow those results to be cached.
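
   To make the first option concrete, here is a rough sketch of what 
client-side prefix matching against cached listings could look like. It is not 
code from this PR: `ListCache`, `is_under`, and the plain `String` paths are 
hypothetical names chosen for illustration, and it assumes cached listings are 
recursive (no delimiter), so an entry for a parent prefix contains everything 
beneath it.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory cache of LIST results, keyed by the listed prefix.
/// Assumes every cached listing is recursive (no delimiter).
#[derive(Default)]
pub struct ListCache {
    entries: BTreeMap<String, Vec<String>>,
}

/// True if `path` equals `prefix` or sits beneath it on a `/` boundary,
/// so `data/a=10` is not treated as being under `data/a=1`.
fn is_under(path: &str, prefix: &str) -> bool {
    let prefix = prefix.trim_end_matches('/');
    if prefix.is_empty() {
        return true;
    }
    match path.strip_prefix(prefix) {
        Some(rest) => rest.is_empty() || rest.starts_with('/'),
        None => false,
    }
}

impl ListCache {
    /// Store the full listing obtained for `prefix`.
    pub fn insert(&mut self, prefix: &str, paths: Vec<String>) {
        self.entries
            .insert(prefix.trim_end_matches('/').to_string(), paths);
    }

    /// Answer a LIST for `prefix` from an entry cached for that prefix or
    /// for any ancestor of it, filtering the cached paths down to `prefix`.
    /// Returns `None` on a miss; a cached child prefix never answers a
    /// request for its parent.
    pub fn get(&self, prefix: &str) -> Option<Vec<String>> {
        let wanted = prefix.trim_end_matches('/');
        self.entries
            .iter()
            .find(|(cached, _)| is_under(wanted, cached.as_str()))
            .map(|(_, paths)| {
                paths
                    .iter()
                    .filter(|p| is_under(p.as_str(), wanted))
                    .cloned()
                    .collect()
            })
    }
}
```

   In this sketch a cached `path/to/foo` can serve `path/to/foo/bar`, while a 
cached `path/to/foo/bar` is still stored and reused for repeated requests for 
that same prefix but is never used to answer `path/to/foo`. The filtering step 
is the same prefix match that could be applied to the listing stream, so the 
sketch does not settle which place is better for it.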



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

