Taragolis commented on code in PR #33504:
URL: https://github.com/apache/airflow/pull/33504#discussion_r1298836748
##########
airflow/providers/amazon/aws/hooks/s3.py:
##########
@@ -823,7 +817,7 @@ def _is_in_period(input_date: datetime) -> bool:
if "Contents" in page:
new_keys = page["Contents"]
if _apply_wildcard:
- new_keys = (k for k in new_keys if
fnmatch.fnmatch(k["Key"], _original_prefix))
Review Comment:
I son't think that is correct to use `Path` on non FS related objects, like
S3 Key. In additional actual type object depend on OS.
```python
from pathlib import Path
import fnmatch
sample_s3_keys = {
"/foo/bar/spam/egg", # this should be invalid key because s3 should not
be started with `/`
"foo/bar/spam/egg",
"foo////bar///spam/egg"
}
for key in sample_s3_keys:
path_key = Path(key)
print(f"{type(path_key).__name__}, {path_key},
{path_key.match('foo/bar*')}")
print(f"{type(key).__name__}, {key}, {fnmatch.fnmatch(key, 'foo/bar*')}")
print()
```
Linux/macOS output
```console
PosixPath, /foo/bar/spam/egg, False
str, /foo/bar/spam/egg, False
PosixPath, foo/bar/spam/egg, False
str, foo/bar/spam/egg, True
PosixPath, foo/bar/spam/egg, False
str, foo////bar///spam/egg, False
```
Windows
```console
WindowsPath, \foo\bar\spam\egg, False
str, /foo/bar/spam/egg, False
WindowsPath, foo\bar\spam\egg, False
str, foo/bar/spam/egg, True
WindowsPath, foo\bar\spam\egg, False
str, foo////bar///spam/egg, False
```
##########
airflow/providers/sftp/hooks/sftp.py:
##########
@@ -379,11 +379,7 @@ def get_file_by_pattern(self, path, fnmatch_pattern) ->
str:
:param fnmatch_pattern: The pattern that will be matched with `fnmatch`
:return: string containing the first found file, or an empty string if
none matched
"""
- for file in self.list_directory(path):
- if fnmatch(file, fnmatch_pattern):
- return file
-
- return ""
+ return next((file for file in self.list_directory(path) if
Path(file).match(fnmatch_pattern)), "")
Review Comment:
Same here
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]