itholic commented on code in PR #40420: URL: https://github.com/apache/spark/pull/40420#discussion_r1310662293
##########
python/pyspark/pandas/datetimes.py:
##########
@@ -116,26 +117,55 @@ def pandas_microsecond(s) -> ps.Series[np.int32]:  # type: ignore[no-untyped-def
     def nanosecond(self) -> "ps.Series":
         raise NotImplementedError()

-    # TODO(SPARK-42617): Support isocalendar.week and replace it.
-    # See also https://github.com/pandas-dev/pandas/pull/33595.
-    @property
-    def week(self) -> "ps.Series":
+    def isocalendar(self) -> "ps.DataFrame":
         """
-        The week ordinal of the year.
+        Calculate year, week, and day according to the ISO 8601 standard.

-        .. deprecated:: 3.4.0
-        """
-        warnings.warn(
-            "weekofyear and week have been deprecated.",
-            FutureWarning,
-        )
-        return self._data.spark.transform(lambda c: F.weekofyear(c).cast(LongType()))
+        .. versionadded:: 4.0.0

-    @property
-    def weekofyear(self) -> "ps.Series":
-        return self.week
+        Returns
+        -------
+        DataFrame
+            With columns year, week and day.

-    weekofyear.__doc__ = week.__doc__
+        Examples
+        --------
+        >>> dfs = ps.from_pandas(pd.date_range(start='2019-12-29', freq='D', periods=4).to_series())
+        >>> dfs.dt.isocalendar()
+                    year  week  day
+        2019-12-29  2019    52    7
+        2019-12-30  2020     1    1
+        2019-12-31  2020     1    2
+        2020-01-01  2020     1    3
+        >>> dfs.dt.isocalendar().week
+        2019-12-29    52
+        2019-12-30     1
+        2019-12-31     1
+        2020-01-01     1
+        Name: week, dtype: int64
+        """
+
+        return_types = [self._data.index.dtype, int, int, int]
+
+        def pandas_isocalendar(  # type: ignore[no-untyped-def]
+            pdf,
+        ) -> ps.DataFrame[return_types]:  # type: ignore[valid-type]
+            # cast to int64 due to UInt32 is not supported by spark

Review Comment:
   > cast to int64 due to UInt32 is not supported by spark

   Does this mean that the result is different from pandas?
   If so, let's add a "Note" to the docstring so that users recognize this difference instead of just adding the comment here.
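   To make the dtype difference this question is about concrete, here is a minimal plain-pandas sketch (assuming pandas >= 1.1, where `Series.dt.isocalendar()` is available). It only illustrates what a docstring "Note" would be contrasting against; it is not the pandas-on-Spark implementation itself:

   ```python
   import pandas as pd

   # Plain pandas: isocalendar() returns a DataFrame whose year/week/day
   # columns use the nullable UInt32 extension dtype.
   s = pd.date_range(start="2019-12-29", freq="D", periods=4).to_series()
   iso = s.dt.isocalendar()
   print(iso.dtypes)
   # year    UInt32
   # week    UInt32
   # day     UInt32
   # dtype: object

   # The diff above instead casts these columns to int64, because Spark SQL
   # has no unsigned 32-bit integer type, so pandas-on-Spark's
   # dfs.dt.isocalendar().week reports dtype: int64 as in the docstring example.
   ```

   A "Note" along these lines in the docstring would let users know the returned columns are int64 rather than pandas' UInt32.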