pankajkoti commented on code in PR #36085: URL: https://github.com/apache/airflow/pull/36085#discussion_r1417131009
########## airflow/providers/weaviate/operators/weaviate.py: ########## @@ -40,9 +44,11 @@ class WeaviateIngestOperator(BaseOperator): :param conn_id: The Weaviate connection. :param class_name: The Weaviate class to be used for storing the data objects into. + :param input_data: The list of dicts or pandas dataframe representing Weaviate data objects to generate + embeddings on (or provides custom vectors) and store them in the Weaviate class. :param input_json: The JSON representing Weaviate data objects to generate embeddings on (or provides Review Comment: Can we update the docstring to mark this as deprecated? ########## airflow/providers/weaviate/hooks/weaviate.py: ########## @@ -135,22 +139,40 @@ def create_schema(self, schema_json: dict[str, Any]) -> None: client = self.conn client.schema.create(schema_json) + @staticmethod + def check_http_error_should_retry(exc: BaseException): + return isinstance(exc, requests.HTTPError) and not exc.response.ok + def batch_data( - self, class_name: str, data: list[dict[str, Any]], batch_config_params: dict[str, Any] | None = None + self, + class_name: str, + data: list[dict[str, Any]] | pd.DataFrame, + batch_config_params: dict[str, Any] | None = None, + vector_col: str = "Vector", + no_retry_attempts_per_object: int = 5, Review Comment: ```suggestion retry_attempts_per_object: int = 5, ``` I am guessing you meant no=number, but looking at the first time I read it as NO (some negation). ########## airflow/providers/weaviate/hooks/weaviate.py: ########## @@ -135,22 +139,40 @@ def create_schema(self, schema_json: dict[str, Any]) -> None: client = self.conn client.schema.create(schema_json) + @staticmethod + def check_http_error_should_retry(exc: BaseException): + return isinstance(exc, requests.HTTPError) and not exc.response.ok + def batch_data( - self, class_name: str, data: list[dict[str, Any]], batch_config_params: dict[str, Any] | None = None + self, + class_name: str, + data: list[dict[str, Any]] | pd.DataFrame, + batch_config_params: dict[str, Any] | None = None, + vector_col: str = "Vector", Review Comment: I think it would make sense to add docstring for this public method of the hook where we explain what these params mean. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org