Hi,

In my Spark application, I am loading some rows from a database into Spark
RDDs. Each row has several fields and a string key. Due to my requirements, I
need to work with consecutive numeric ids (from 1 to N, where N is the
number of unique keys) instead of string keys. Also, several rows can
share the same string key.

Within a Spark context, how can I map each row to (Numeric_Key, OriginalRow)
using map/reduce tasks, such that rows with the same original string key
get the same consecutive numeric key?
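For context, the mapping I am after looks like the sketch below. It is plain
Python that only mimics the logic; I assume the Spark equivalent would be
something like numbering the distinct keys (e.g. via distinct() and
zipWithIndex()) and then joining that id table back onto the original rows.
The row data and key names here are just made-up examples.

```python
# Plain-Python sketch of the desired mapping (not actual Spark code).
# rows: (string_key, original_row) pairs, with duplicate keys allowed
rows = [("apple", "row1"), ("banana", "row2"), ("apple", "row3")]

# Step 1: collect the distinct string keys and assign each a consecutive
# numeric id starting from 1 (sorting just makes the ids deterministic).
distinct_keys = sorted({k for k, _ in rows})
key_to_id = {k: i + 1 for i, k in enumerate(distinct_keys)}

# Step 2: "join" the id table back onto the original rows, so every row
# with the same string key receives the same numeric id.
numbered = [(key_to_id[k], row) for k, row in rows]
# numbered == [(1, "row1"), (2, "row2"), (1, "row3")]
```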

Any hints?

best,
/Shahab
