Thanks Enrico. I meant one hash of each single row, in an extra column, something like this:

val newDs = typedRows.withColumn("hash", hash(typedRows.columns.map(col): _*))
On Mon, Mar 2, 2020 at 3:51 PM Enrico Minack <m...@enrico.minack.dev> wrote:
> Well, then apply md5 on all columns:
>
> ds.select(ds.columns.map(col) ++ ds.columns.map(column =>
>   md5(col(column)).as(s"$column hash")): _*).show(false)
>
> Enrico
>
> On 02.03.20 at 11:10, Chetan Khatri wrote:
>> Thanks Enrico
>> I want to compute the hash of all the column values in the row.
>>
>> On Fri, Feb 28, 2020 at 7:28 PM Enrico Minack <m...@enrico.minack.dev>
>> wrote:
>>> This computes the md5 hash of a given column id of Dataset ds:
>>>
>>> ds.withColumn("id hash", md5($"id")).show(false)
>>>
>>> Test with this Dataset ds:
>>>
>>> import org.apache.spark.sql.types._
>>> val ds = spark.range(10).select($"id".cast(StringType))
>>>
>>> Available are md5, sha, sha1, sha2 and hash:
>>> https://spark.apache.org/docs/2.4.5/api/sql/index.html
>>>
>>> Enrico
>>>
>>> On 28.02.20 at 13:56, Chetan Khatri wrote:
>>>> Hi Spark Users,
>>>> How can I compute a hash of each row and store it in a new column of
>>>> the DataFrame? Could someone help me?
>>>>
>>>> Thanks
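For anyone following along: Spark's md5() function returns the hex-encoded MD5 digest of the value's UTF-8 bytes, so a row-level digest can also be built by concatenating all column values before hashing, e.g. md5(concat_ws("|", ds.columns.map(col): _*)). Below is a minimal plain-Scala sketch of that idea outside Spark (using java.security.MessageDigest); the md5Hex/rowHash names and the "|" separator are illustrative choices, not from the thread:

```scala
import java.security.MessageDigest

// Hex-encoded MD5 digest of a string's UTF-8 bytes -- the same value
// Spark's md5() SQL function produces for a string column.
def md5Hex(s: String): String =
  MessageDigest.getInstance("MD5")
    .digest(s.getBytes("UTF-8"))
    .map("%02x".format(_))
    .mkString

// Row-level hash: join all column values with a separator, then hash.
// Mirrors md5(concat_ws("|", cols: _*)) in Spark (separator is an
// illustrative choice; pick one that cannot appear inside a value).
def rowHash(values: Seq[String]): String =
  md5Hex(values.mkString("|"))
```

Note that Spark's built-in hash() used in the snippet above computes a 32-bit Murmur3 hash instead, which is cheap but collision-prone; for change detection across rows, md5/sha2 over concatenated columns is the safer choice.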