It looks like you want to add a column such that ordering by that column
preserves the current order of the DataFrame.
The monotonically_increasing_id() function does this. The IDs it generates
are unique and increasing, but not necessarily consecutive:
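# assumes an active SparkSession named `spark`
from pyspark.sql.functions import monotonically_increasing_id

spark.range(0, 10, 1, 5).withColumn("row", monotonically_increasing_id()).show()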
+---+---+
| id|
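
To see why the generated IDs are increasing but not consecutive, here is a rough plain-Python sketch of the layout the Spark documentation describes for the current implementation: the partition index goes in the upper bits and the row's position within its partition goes in the lower 33 bits. The helper name `simulated_ids` is hypothetical and this is only a simulation, not Spark's own code.

```python
def simulated_ids(partition_sizes):
    """Mimic the documented ID layout: partition index in the upper bits,
    row position within the partition in the lower 33 bits."""
    ids = []
    for partition_index, size in enumerate(partition_sizes):
        base = partition_index << 33  # first ID handed out in this partition
        ids.extend(base + offset for offset in range(size))
    return ids


# 10 rows split evenly over 5 partitions, as in spark.range(0, 10, 1, 5)
ids = simulated_ids([2] * 5)
print(ids[:4])  # [0, 1, 8589934592, 8589934593]
```

Sorting by such an ID keeps rows in partition order and, within each partition, in their existing row order. The large gaps between partitions are expected.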
I want to maintain the order of the rows in a DataFrame in PySpark. Is
there any way to achieve this? The function below adds a row ID column
that numbers each row. Currently, it rearranges the rows of the
DataFrame.
def createRowIdColumn( new