[ https://issues.apache.org/jira/browse/SPARK-38627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511763#comment-17511763 ]
Prakhar Sandhu commented on SPARK-38627:
----------------------------------------

Hi [~hyukjin.kwon],

Nice ^^
 # Did it work on Spark 3.3?
 # What environment are you using?

I have set up a conda environment on my local system with Spark 3.2.

I specified the NumPy conversion explicitly and got the following error:
{code:java}
    df = pd.DataFrame({
        'Date1': rng.to_numpy,
        'Date2': rng.to_numpy})
  File "C:\Users\abc\Anaconda3\envs\env2\lib\site-packages\pyspark\pandas\frame.py", line 519, in __init__
    pdf = pd.DataFrame(data=data, index=index, columns=columns, dtype=dtype, copy=copy)
  File "C:\Users\abc\Anaconda3\envs\env2\lib\site-packages\pandas\core\frame.py", line 435, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "C:\Users\abc\Anaconda3\envs\env2\lib\site-packages\pandas\core\internals\construction.py", line 254, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "C:\Users\abc\Anaconda3\envs\env2\lib\site-packages\pandas\core\internals\construction.py", line 64, in arrays_to_mgr
    index = extract_index(arrays)
  File "C:\Users\abc\Anaconda3\envs\env2\lib\site-packages\pandas\core\internals\construction.py", line 355, in extract_index
    raise ValueError("If using all scalar values, you must pass an index")
ValueError: If using all scalar values, you must pass an index
{code}

> TypeError: Datetime subtraction can only be applied to datetime series
> ----------------------------------------------------------------------
>
>                 Key: SPARK-38627
>                 URL: https://issues.apache.org/jira/browse/SPARK-38627
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.2.1
>            Reporter: Prakhar Sandhu
>            Priority: Major
>
> I am trying to replace pandas with the pyspark.pandas library. Here, pdf is a pyspark.pandas DataFrame. When I tried this:
> {code:java}
> pdf["date_diff"] = (pdf["date1"] - pdf["date2"])/pdf.Timedelta(days=30){code}
> I got the error below:
> {code:java}
>   File "C:\Users\abc\Anaconda3\envs\test\lib\site-packages\pyspark\pandas\data_type_ops\datetime_ops.py", line 75, in sub
>     raise TypeError("Datetime subtraction can only be applied to datetime series.")
> {code}
>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org