xuang7 commented on code in PR #5570:
URL: https://github.com/apache/texera/pull/5570#discussion_r3444295364
##########
common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/PythonCodegenBase.scala:
##########
@@ -702,6 +760,22 @@ object PythonCodegenBase {
| def _image_input_as_base64(self, image_bytes):
| return base64.b64encode(image_bytes).decode("utf-8")
|
+ | def _read_audio_input(self):
+ | audio_input = str(self.AUDIO_INPUT or "").strip()
+ | if audio_input.startswith("data:"):
+ | _, encoded = audio_input.split(",", 1)
+ | return base64.b64decode(encoded)
+ | if audio_input.startswith("http://") or audio_input.startswith("https://"):
+ | resp = requests.get(audio_input, timeout=120)
+ | resp.raise_for_status()
+ | return resp.content
+ | if not os.path.exists(audio_input):
+ | raise FileNotFoundError(f"Audio file not found at path: {audio_input}")
+ | if not os.path.isfile(audio_input):
+ | raise ValueError(f"Audio input path is not a file: {audio_input}")
Review Comment:
_read_audio_input does not yet follow the fixed image-input pattern. It
still reads arbitrary local files through open(audio_input) and fetches
arbitrary URLs with a raw requests.get, including http:// URLs.
Could we remove the local-file branch, disallow http://, and route
remote fetching through the existing _fetch_remote_url helper instead?
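
For illustration, a rough sketch of the shape I have in mind. The call
signature of _fetch_remote_url is not visible in this hunk, so the sketch
assumes it takes a URL and returns bytes; the class name AudioInputSketch,
the injected fetcher, and the error message are all hypothetical and only
there to keep the example self-contained.

```python
import base64


class AudioInputSketch:
    """Hypothetical stand-in for the generated operator class."""

    def __init__(self, audio_input, fetch_remote_url):
        self.AUDIO_INPUT = audio_input
        # Assumption: stands in for the existing _fetch_remote_url helper,
        # taken here to accept a URL string and return the response bytes.
        self._fetch_remote_url = fetch_remote_url

    def _read_audio_input(self):
        audio_input = str(self.AUDIO_INPUT or "").strip()
        # Inline data: URIs are decoded locally, as in the current hunk.
        if audio_input.startswith("data:"):
            _, encoded = audio_input.split(",", 1)
            return base64.b64decode(encoded)
        # Only https:// is fetched, and only through the shared helper.
        if audio_input.startswith("https://"):
            return self._fetch_remote_url(audio_input)
        # Plain http:// URLs and local paths are both rejected.
        raise ValueError("Audio input must be a data: URI or an https:// URL")
```

With this shape, an http:// URL or a local path such as /etc/passwd falls
through to the ValueError instead of being fetched or opened.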