[GitHub] [spark] Yikun commented on a change in pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series

2021-11-22 Thread GitBox


Yikun commented on a change in pull request #34687:
URL: https://github.com/apache/spark/pull/34687#discussion_r754782336



##
File path: python/pyspark/pandas/tests/data_type_ops/testing_utils.py
##
@@ -49,8 +49,15 @@ def numeric_pdf(self):
         dtypes = [np.int32, int, np.float32, float]
         sers = [pd.Series([1, 2, 3], dtype=dtype) for dtype in dtypes]
         sers.append(pd.Series([decimal.Decimal(1), decimal.Decimal(2), decimal.Decimal(3)]))
+        sers.append(pd.Series([1, 2, np.nan], dtype=float))
         pdf = pd.concat(sers, axis=1)
-        pdf.columns = [dtype.__name__ for dtype in dtypes] + ["decimal"]
+        pdf.columns = [dtype.__name__ for dtype in dtypes] + [
+            "decimal",
+            "float_nan",
+        ]
+        if LooseVersion(pd.__version__) >= LooseVersion("1.3.0"):

Review comment:
   Yes, old versions of pandas fail the same way with or without this PR. This condition just skips those tests on old pandas versions and enables all of the decimal('nan') test cases on pandas v1.3.0+.
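
   For readers skimming the archive, a minimal standalone sketch of the version gate under discussion follows. The quoted hunk cuts off at the `if` line, so the gated `decimal_nan` series below is an assumption pieced together from the PR title and the skip note quoted later in this thread, not the exact PR code:

   ```python
   import decimal
   from distutils.version import LooseVersion

   import numpy as np
   import pandas as pd

   # Mirrors the numeric_pdf fixture in the hunk above.
   dtypes = [np.int32, int, np.float32, float]
   sers = [pd.Series([1, 2, 3], dtype=dtype) for dtype in dtypes]
   sers.append(pd.Series([decimal.Decimal(1), decimal.Decimal(2), decimal.Decimal(3)]))
   sers.append(pd.Series([1, 2, np.nan], dtype=float))
   columns = [dtype.__name__ for dtype in dtypes] + ["decimal", "float_nan"]

   if LooseVersion(pd.__version__) >= LooseVersion("1.3.0"):
       # Assumed gate body: only exercise decimal('nan') on pandas 1.3.0+,
       # since pandas-on-Spark fails on it with older pandas either way.
       sers.append(pd.Series([decimal.Decimal(1), decimal.Decimal(2), decimal.Decimal("nan")]))
       columns.append("decimal_nan")

   pdf = pd.concat(sers, axis=1)
   pdf.columns = columns
   ```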




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org




[GitHub] [spark] Yikun commented on a change in pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series

2021-11-22 Thread GitBox


Yikun commented on a change in pull request #34687:
URL: https://github.com/apache/spark/pull/34687#discussion_r754789113



##
File path: python/pyspark/pandas/tests/data_type_ops/testing_utils.py
##
@@ -49,8 +49,15 @@ def numeric_pdf(self):
         dtypes = [np.int32, int, np.float32, float]
         sers = [pd.Series([1, 2, 3], dtype=dtype) for dtype in dtypes]
         sers.append(pd.Series([decimal.Decimal(1), decimal.Decimal(2), decimal.Decimal(3)]))
+        sers.append(pd.Series([1, 2, np.nan], dtype=float))
         pdf = pd.concat(sers, axis=1)
-        pdf.columns = [dtype.__name__ for dtype in dtypes] + ["decimal"]
+        pdf.columns = [dtype.__name__ for dtype in dtypes] + [
+            "decimal",
+            "float_nan",
+        ]
+        if LooseVersion(pd.__version__) >= LooseVersion("1.3.0"):

Review comment:
   Currently:
   - pandas-on-Spark with old versions of pandas fails the same way with or without this PR (which is very weird).
   - Old versions of native pandas do support decimal.
   
   This PR only enables the test case with new versions of pandas.







[GitHub] [spark] Yikun commented on a change in pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series

2021-11-22 Thread GitBox


Yikun commented on a change in pull request #34687:
URL: https://github.com/apache/spark/pull/34687#discussion_r754789113



##
File path: python/pyspark/pandas/tests/data_type_ops/testing_utils.py
##
@@ -49,8 +49,15 @@ def numeric_pdf(self):
         dtypes = [np.int32, int, np.float32, float]
         sers = [pd.Series([1, 2, 3], dtype=dtype) for dtype in dtypes]
         sers.append(pd.Series([decimal.Decimal(1), decimal.Decimal(2), decimal.Decimal(3)]))
+        sers.append(pd.Series([1, 2, np.nan], dtype=float))
         pdf = pd.concat(sers, axis=1)
-        pdf.columns = [dtype.__name__ for dtype in dtypes] + ["decimal"]
+        pdf.columns = [dtype.__name__ for dtype in dtypes] + [
+            "decimal",
+            "float_nan",
+        ]
+        if LooseVersion(pd.__version__) >= LooseVersion("1.3.0"):

Review comment:
   - pandas-on-Spark with old versions of pandas fails the same way with or without this PR (which is very weird).
   - Old versions of native pandas do support decimal.
   
   This PR only enables the test case for pandas-on-Spark with new versions of pandas.







[GitHub] [spark] Yikun commented on a change in pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series

2021-11-22 Thread GitBox


Yikun commented on a change in pull request #34687:
URL: https://github.com/apache/spark/pull/34687#discussion_r754789113



##
File path: python/pyspark/pandas/tests/data_type_ops/testing_utils.py
##
@@ -49,8 +49,15 @@ def numeric_pdf(self):
         dtypes = [np.int32, int, np.float32, float]
         sers = [pd.Series([1, 2, 3], dtype=dtype) for dtype in dtypes]
         sers.append(pd.Series([decimal.Decimal(1), decimal.Decimal(2), decimal.Decimal(3)]))
+        sers.append(pd.Series([1, 2, np.nan], dtype=float))
         pdf = pd.concat(sers, axis=1)
-        pdf.columns = [dtype.__name__ for dtype in dtypes] + ["decimal"]
+        pdf.columns = [dtype.__name__ for dtype in dtypes] + [
+            "decimal",
+            "float_nan",
+        ]
+        if LooseVersion(pd.__version__) >= LooseVersion("1.3.0"):

Review comment:
   - pandas-on-Spark with old versions of pandas fails the same way with or without this PR (which is very weird).
   - Old versions of native pandas do support decimal.
   
   This PR only enables the test case for pandas-on-Spark with pandas v1.3.0+.







[GitHub] [spark] Yikun commented on a change in pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series

2021-11-22 Thread GitBox


Yikun commented on a change in pull request #34687:
URL: https://github.com/apache/spark/pull/34687#discussion_r754789113



##
File path: python/pyspark/pandas/tests/data_type_ops/testing_utils.py
##
@@ -49,8 +49,15 @@ def numeric_pdf(self):
         dtypes = [np.int32, int, np.float32, float]
         sers = [pd.Series([1, 2, 3], dtype=dtype) for dtype in dtypes]
         sers.append(pd.Series([decimal.Decimal(1), decimal.Decimal(2), decimal.Decimal(3)]))
+        sers.append(pd.Series([1, 2, np.nan], dtype=float))
         pdf = pd.concat(sers, axis=1)
-        pdf.columns = [dtype.__name__ for dtype in dtypes] + ["decimal"]
+        pdf.columns = [dtype.__name__ for dtype in dtypes] + [
+            "decimal",
+            "float_nan",
+        ]
+        if LooseVersion(pd.__version__) >= LooseVersion("1.3.0"):

Review comment:
   - pandas-on-Spark with old versions of pandas fails the same way with or without this PR (which is very weird; see https://gist.github.com/Yikun/6b88920652fc535b336a03746fe3b04f).
   - Old versions of native pandas do support decimal.
   
   This PR only enables the test case for pandas-on-Spark with pandas v1.3.0+.







[GitHub] [spark] Yikun commented on a change in pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series

2021-11-22 Thread GitBox


Yikun commented on a change in pull request #34687:
URL: https://github.com/apache/spark/pull/34687#discussion_r754789113



##
File path: python/pyspark/pandas/tests/data_type_ops/testing_utils.py
##
@@ -49,8 +49,15 @@ def numeric_pdf(self):
         dtypes = [np.int32, int, np.float32, float]
         sers = [pd.Series([1, 2, 3], dtype=dtype) for dtype in dtypes]
         sers.append(pd.Series([decimal.Decimal(1), decimal.Decimal(2), decimal.Decimal(3)]))
+        sers.append(pd.Series([1, 2, np.nan], dtype=float))
         pdf = pd.concat(sers, axis=1)
-        pdf.columns = [dtype.__name__ for dtype in dtypes] + ["decimal"]
+        pdf.columns = [dtype.__name__ for dtype in dtypes] + [
+            "decimal",
+            "float_nan",
+        ]
+        if LooseVersion(pd.__version__) >= LooseVersion("1.3.0"):

Review comment:
   - pandas-on-Spark with old versions of pandas fails the same way with or without this PR (which is very weird; see https://gist.github.com/Yikun/6b88920652fc535b336a03746fe3b04f).
   - pandas-on-Spark with pandas v1.3.0+ passes with this PR.
   - Old versions of native pandas do support decimal.
   
   This PR only enables the test case for pandas-on-Spark with pandas v1.3.0+.
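
   To illustrate the native-pandas bullet above, a small hedged example (plain pandas, no Spark involved): arithmetic on an object-dtype Decimal series, including `Decimal("nan")`, simply propagates NaN elementwise, even on older pandas:

   ```python
   import decimal

   import pandas as pd

   # Object-dtype series of Decimals, with a Decimal NaN at the end.
   s = pd.Series([decimal.Decimal(1), decimal.Decimal(2), decimal.Decimal("nan")])

   print(s + s)  # -> 2, 4, NaN: Decimal("nan") propagates through addition
   print(s * 2)  # scalar multiplication behaves the same way
   ```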










[GitHub] [spark] Yikun commented on a change in pull request #34687: [SPARK-36231][PYTHON] Support arithmetic operations of decimal(nan) series

2021-11-23 Thread GitBox


Yikun commented on a change in pull request #34687:
URL: https://github.com/apache/spark/pull/34687#discussion_r754789113



##
File path: python/pyspark/pandas/tests/data_type_ops/testing_utils.py
##
@@ -49,8 +49,15 @@ def numeric_pdf(self):
         dtypes = [np.int32, int, np.float32, float]
         sers = [pd.Series([1, 2, 3], dtype=dtype) for dtype in dtypes]
         sers.append(pd.Series([decimal.Decimal(1), decimal.Decimal(2), decimal.Decimal(3)]))
+        sers.append(pd.Series([1, 2, np.nan], dtype=float))
         pdf = pd.concat(sers, axis=1)
-        pdf.columns = [dtype.__name__ for dtype in dtypes] + ["decimal"]
+        pdf.columns = [dtype.__name__ for dtype in dtypes] + [
+            "decimal",
+            "float_nan",
+        ]
+        if LooseVersion(pd.__version__) >= LooseVersion("1.3.0"):

Review comment:
   - pandas-on-Spark with old versions of pandas fails the same way with or without this PR (which is very weird; see https://gist.github.com/Yikun/6b88920652fc535b336a03746fe3b04f). I added the note: `# Skip decimal_nan test before v1.3.0, it not supported by pandas on spark yet.`
   - pandas-on-Spark with pandas v1.3.0+ passes with this PR.
   - Old versions of native pandas do support decimal.
   
   This PR only enables the test case for pandas-on-Spark with pandas v1.3.0+.



