[ https://issues.apache.org/jira/browse/SPARK-37491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472056#comment-17472056 ]
pralabhkumar commented on SPARK-37491:
--------------------------------------

Let's take the example of

    pser = pd.Series([2, 1, np.nan, 4], index=[10, 20, 30, 40], name="Koalas")

pser.asof([5, 20]) gives [NaN, 1], while ps.from_pandas(pser).asof([5, 20]) gives [NaN, 2].

*Explanation*

This is the DataFrame created after applying the condition F.when(index_scol <= SF.lit(index).cast(index_type), ...), before the max aggregation is applied:

+-----+------+-----------------+
|col_5|col_20|__index_level_0__|
+-----+------+-----------------+
|null |2.0   |10               |
|null |1.0   |20               |
|null |null  |30               |
|null |null  |40               |
+-----+------+-----------------+

Since we take the max, the output comes out as 2. What we actually need is the last non-null value of each column in increasing order of __index_level_0__.

To implement that logic, I am planning to build the DataFrame below from the one above, using explode, partitioning, and row_number over __index_level_0__ (a rough sketch follows at the end of this message):

__index_level_0__   identifier   value   row_number
40                  col_5        null    1
30                  col_5        null    2
20                  col_5        null    3
10                  col_5        null    4
40                  col_20       2       1
30                  col_20       1       2
20                  col_20       null    3
10                  col_20       null    4

Then filter on row_number = 1. There are other things to take care of, but this is the bulk of the logic. Please let me know whether this is the right direction (it actually passes all the asof test cases, including the case described in this Jira). [~itholic]

> Fix Series.asof when values of the series is not sorted
> -------------------------------------------------------
>
>                 Key: SPARK-37491
>                 URL: https://issues.apache.org/jira/browse/SPARK-37491
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.3.0
>            Reporter: dch nguyen
>            Priority: Major
>
> https://github.com/apache/spark/pull/34737#discussion_r758223279