uranusjr commented on code in PR #36085:
URL: https://github.com/apache/airflow/pull/36085#discussion_r1418707392


##########
airflow/providers/weaviate/hooks/weaviate.py:
##########
@@ -135,22 +141,52 @@ def create_schema(self, schema_json: dict[str, Any]) -> None:
         client = self.conn
         client.schema.create(schema_json)
 
+    @staticmethod
+    def check_http_error_should_retry(exc: BaseException):
+        return isinstance(exc, requests.HTTPError) and not exc.response.ok
+
     def batch_data(
-        self, class_name: str, data: list[dict[str, Any]], batch_config_params: dict[str, Any] | None = None
+        self,
+        class_name: str,
+        data: list[dict[str, Any]] | pd.DataFrame,
+        batch_config_params: dict[str, Any] | None = None,
+        vector_col: str = "Vector",
+        retry_attempts_per_object: int = 5,
     ) -> None:
+        """
+        Add multiple objects or object references at once into weaviate.
+
+        :param class_name: The name of the class that objects belongs to.
+        :param data: list or dataframe of objects we want to add.
+        :param batch_config_params: dict of batch configuration option.
+            .. seealso:: `batch_config_params options <https://weaviate-python-client.readthedocs.io/en/v3.25.3/weaviate.batch.html#weaviate.batch.Batch.configure>`__
+        :param vector_col: name of the column containing the vector.
+        :param retry_attempts_per_object: number of time to try in case of failure before giving up.
+        """
+        import pandas as pd

Review Comment:
   Since Weaviate does not strictly require Pandas to function, it would be 
better to do something like
   
   ```python
   with contextlib.suppress(ImportError):
       import pandas
   
       if isinstance(data, pandas.DataFrame):
           ...
   ```
   
   If the import fails, `data` can never be a DataFrame (it’s impossible to create one without Pandas installed), so the `isinstance` check can safely sit inside the same guard and be skipped along with the import.
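
A minimal, self-contained sketch of the pattern suggested above, assuming the DataFrame branch only needs to turn rows into dicts. `normalize_records` is a hypothetical helper used for illustration; it is not part of the Weaviate hook or of this PR. It should behave the same for list input whether or not Pandas is installed.

```python
from __future__ import annotations

import contextlib
from typing import Any


def normalize_records(data: Any) -> list[dict[str, Any]]:
    """Return ``data`` as a list of dicts (hypothetical helper, for illustration only)."""
    # If pandas is not installed, the import raises ImportError, the rest of
    # the block is skipped, and ``data`` falls through to the list branch.
    with contextlib.suppress(ImportError):
        import pandas

        if isinstance(data, pandas.DataFrame):
            # One dict per row, keyed by column name.
            return data.to_dict("records")
    return list(data)


rows = normalize_records([{"title": "a"}, {"title": "b"}])
```

One caveat with this shape: `contextlib.suppress(ImportError)` also swallows any `ImportError` raised by code inside the block, so the guarded body is best kept small.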






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
