dengziming commented on code in PR #38984:
URL: https://github.com/apache/spark/pull/38984#discussion_r1051305047


##########
python/pyspark/sql/connect/dataframe.py:
##########
@@ -875,6 +875,30 @@ def to_jcols(
 
     melt = unpivot
 
+    def hint(self, name: str, *params: Any) -> "DataFrame":
+        """
+        Specifies some hint on the current DataFrame. As an example, the following code specifies
+        that one of the plan can be broadcasted: `df1.join(df2.hint("broadcast"))`
+
+        .. versionadded:: 3.4.0
+
+        Parameters
+        ----------
+        name: str
+            the name of the hint, for example, "broadcast", "SHUFFLE_MERGE" and "shuffle_hash".
+        params: tuple
+            the parameters of the hint

Review Comment:
   Yeah, it's worth improving the docs for the parameters.



##########
python/pyspark/sql/connect/plan.py:
##########
@@ -343,6 +343,51 @@ def _repr_html_(self) -> str:
         """
 
 
+class Hint(LogicalPlan):
+    """Logical plan object for a Hint operation."""
+
+    def __init__(self, child: Optional["LogicalPlan"], name: str, params: List[Any]) -> None:
+        super().__init__(child)
+        self.name = name
+        self.params = params
+
+    def _convert_value(self, v: Any) -> proto.Expression.Literal:
+        value = proto.Expression.Literal()
+        if v is None:
+            value.null = True
+        elif isinstance(v, int):
+            value.integer = v
+        else:
+            value.string = v
+        return value

Review Comment:
   I improved the error-handling logic here. There are 4 occurrences of similar logic, so I'm planning to refactor them to reuse the existing code.
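
For illustration only, a shared helper along these lines might centralize the type checks. This is a minimal sketch, not the code in the PR: the name `convert_hint_param` is hypothetical, a plain dict stands in for `proto.Expression.Literal` so the snippet runs without Spark, and rejecting `bool` and other unsupported types with a `TypeError` is an assumption about the desired behavior.

```python
from typing import Any, Dict


def convert_hint_param(v: Any) -> Dict[str, Any]:
    """Hypothetical helper: map a hint parameter to a one-entry dict.

    The keys mirror the literal fields set in the diff above
    (null, integer, string); the dict is a stand-in for the proto message.
    """
    if v is None:
        return {"null": True}
    # bool is a subclass of int in Python, so it must be checked first.
    # Rejecting it is an assumption made for this sketch.
    if isinstance(v, bool):
        raise TypeError("Unsupported hint parameter type: bool")
    if isinstance(v, int):
        return {"integer": v}
    if isinstance(v, str):
        return {"string": v}
    # Anything else would previously have fallen through to the string branch.
    raise TypeError(f"Unsupported hint parameter type: {type(v).__name__}")
```

With a helper like this, each call site would only need to copy the returned field into the literal, and unsupported values such as floats would fail early with a clear message instead of being assigned to the string field.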



##########
python/pyspark/sql/tests/connect/test_connect_basic.py:
##########
@@ -829,6 +829,13 @@ def test_with_columns(self):
             .toPandas(),
         )
 
+    def test_hint(self):
+        # SPARK-41349: Test hint

Review Comment:
   Good catch, I added 4 more test cases for it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

