Sequences are supported by the Phoenix MapReduce integration, but I'm not sure whether using them through the Spark integration would cause any issues.
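
In case it helps, here is a rough sketch of the 'zipWithIndex' workaround Josh describes in the quoted message, written against PySpark. The helper names (`to_phoenix_row`, `add_sequential_ids`) and the `start_id` parameter are made up for illustration, and the step that actually saves to Phoenix is left out. Note that these IDs come from Spark, not from the Phoenix sequence, so the starting value would have to be chosen so it does not collide with IDs already in the table.

```python
def to_phoenix_row(pair, start_id=1):
    """Unpack a (record, index) pair from zipWithIndex into (id, field1, field2, ...)."""
    record, index = pair
    # The index is 0-based, so shift it by the (hypothetical) starting ID.
    return (start_id + index,) + tuple(record)


def add_sequential_ids(rdd, start_id=1):
    """Attach consecutive IDs to every record of a Spark RDD."""
    # zipWithIndex pairs each element with its position: (record, index).
    # The small map() then reorganizes each pair into a flat row.
    return rdd.zipWithIndex().map(lambda pair: to_phoenix_row(pair, start_id))


if __name__ == "__main__":
    # Imported here so the helpers above can be used and tested without Spark.
    from pyspark import SparkContext

    sc = SparkContext("local[2]", "zip-with-index-sketch")
    records = sc.parallelize([("alice", 30), ("bob", 25), ("carol", 41)])
    # Assumed starting ID of 100; in practice this would come from the table.
    print(add_sequential_ids(records, start_id=100).collect())
    sc.stop()
```

The resulting rows could then be turned back into a DataFrame and written out through phoenix-spark as usual.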
On Monday, August 17, 2015, Josh Mahonin <[email protected]> wrote:

> Hi Satya,
>
> I don't believe sequences are supported by the broader Phoenix map-reduce
> integration, which the phoenix-spark module uses under the hood.
>
> One workaround that would give you sequential IDs, is to use the
> 'zipWithIndex' method on the underlying Spark RDD, with a small 'map()'
> operation to unpack / reorganize the tuple, before saving it to Phoenix.
>
> Good luck!
>
> Josh
>
> On Sat, Aug 15, 2015 at 10:02 AM, Ns G <[email protected]> wrote:
>
>> Hi All,
>>
>> I hope that someone will reply to this email as all my previous emails
>> have been unanswered.
>>
>> I have 10-20 Million records in file and I want to insert it through
>> Phoenix-Spark.
>> The table primary id is generated by a sequence. So, every time an upsert
>> is done, the sequence Id gets generated.
>>
>> Now I want to implement this in Spark and more precisely using data
>> frames. Since RDDs are immutables, How can I add sequence to the rows in
>> dataframe?
>>
>> Thanks for any help or direction or suggestion.
>>
>> Satya
