[ https://issues.apache.org/jira/browse/SPARK-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770658#comment-17770658 ]
ZygD edited comment on SPARK-23074 at 9/30/23 7:58 AM:
-------------------------------------------------------
[~gurwls223] [~Tagar] The problem is {*}not solved{*}! This was incorrectly closed. [The linked closed issue|https://issues.apache.org/jira/browse/SPARK-24042] is about arrays, while this one is not.

was (Author: JIRAUSER286869):
[~gurwls223] [~Tagar] The problem is {*}not solved{*}! This was incorrectly closed. [The linked closed|https://issues.apache.org/jira/browse/SPARK-24042] issue is about arrays, while this is not.

> Dataframe-ified zipwithindex
> ----------------------------
>
>                 Key: SPARK-23074
>                 URL: https://issues.apache.org/jira/browse/SPARK-23074
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: Ruslan Dautkhanov
>            Priority: Minor
>              Labels: bulk-closed, dataframe, rdd
>
> Would be great to have a dataframe-friendly equivalent of rdd.zipWithIndex():
> {code:java}
> import org.apache.spark.sql.DataFrame
> import org.apache.spark.sql.types.{LongType, StructField, StructType}
> import org.apache.spark.sql.Row
>
> def dfZipWithIndex(
>     df: DataFrame,
>     offset: Int = 1,
>     colName: String = "id",
>     inFront: Boolean = true
> ): DataFrame = {
>   df.sqlContext.createDataFrame(
>     df.rdd.zipWithIndex.map(ln =>
>       Row.fromSeq(
>         (if (inFront) Seq(ln._2 + offset) else Seq())
>           ++ ln._1.toSeq ++
>         (if (inFront) Seq() else Seq(ln._2 + offset))
>       )
>     ),
>     StructType(
>       (if (inFront) Array(StructField(colName, LongType, false))
>        else Array[StructField]())
>         ++ df.schema.fields ++
>       (if (inFront) Array[StructField]()
>        else Array(StructField(colName, LongType, false)))
>     )
>   )
> }
> {code}
> credits:
> [https://stackoverflow.com/questions/30304810/dataframe-ified-zipwithindex]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
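Editor's note: the index arithmetic in the quoted `dfZipWithIndex` (an `offset` added to each zipped index, placed either before or after the existing row values) can be illustrated on a plain Scala collection without Spark. This is a hedged sketch for illustration only; `seqZipWithIndex` is a hypothetical helper, not part of the issue or of Spark's API.

```scala
// Mirrors dfZipWithIndex's index arithmetic on an in-memory Seq of rows.
// `offset` and `inFront` behave as in the Spark snippet above: each row
// gains one index value, i + offset, prepended when inFront is true and
// appended otherwise. Illustration only; not Spark code.
def seqZipWithIndex[A](
    rows: Seq[Seq[A]],
    offset: Long = 1L,
    inFront: Boolean = true
): Seq[Seq[Any]] =
  rows.zipWithIndex.map { case (row, i) =>
    val idx = Seq(i + offset)             // zipWithIndex starts at 0, so offset shifts it
    if (inFront) idx ++ row else row ++ idx
  }
```

For example, `seqZipWithIndex(Seq(Seq("a"), Seq("b")))` yields `Seq(Seq(1L, "a"), Seq(2L, "b"))`, matching the default `offset = 1, inFront = true` behavior of the snippet in the issue.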