I previously responded to your post on user@: https://lists.apache.org/thread.html/5c10b7edf982ef63d1d1d70545e3fe2716d00628ff5c2a7854383413@%3Cuser.beam.apache.org%3E
I've also mirrored my response on StackOverflow: https://stackoverflow.com/a/53771980/33791

On Thu, Dec 13, 2018 at 4:21 PM Chak-Pong Chung <[email protected]> wrote:

> Hello everyone!
>
> I asked the following question and think I might get some suggestions
> whether what I want is doable or not.
>
> https://stackoverflow.com/questions/53746046/how-can-i-implement-zipwithindex-like-spark-in-apache-beam/53747612#53747612
>
> If I can get `PCollection` id and the number of (contiguous)lines in each
> `PCollection`, then I can calculate the row order within each
> partition/`PCollection` first and then do prefix-sum to compute the offset
> for each partition. This is doable in MPI or openMP since I can get the
> id/rank of each processor/thread.
>
> Best,
> Chak-Pong

--
Got feedback? tinyurl.com/swegner-feedback
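
P.S. For reference, the two-step idea in the quoted message (count rows per partition, then prefix-sum the counts to get each partition's starting offset) can be sketched in plain Python. This is only an illustration under stated assumptions: it uses no Beam API, the `zip_with_index` helper and the list-of-lists input are hypothetical, and it assumes the partitions already have a fixed, known order, which is exactly the property the question asks about.

```python
from itertools import accumulate


def zip_with_index(partitions):
    """Attach a global index to every row.

    `partitions` is a list of lists and is ASSUMED to be in a fixed,
    known order (hypothetical stand-in for ordered partitions).
    """
    # Step 1: number of rows in each partition.
    counts = [len(part) for part in partitions]

    # Step 2: exclusive prefix sum of the counts, so that the offset of
    # partition i equals the total number of rows in partitions 0..i-1.
    offsets = [0] + list(accumulate(counts))[:-1]

    # Step 3: global index = partition offset + index within the partition.
    return [
        [(offset + local_idx, row) for local_idx, row in enumerate(part)]
        for part, offset in zip(partitions, offsets)
    ]


partitions = [["a", "b"], ["c"], ["d", "e", "f"]]
indexed = zip_with_index(partitions)
```

Steps 1 and 3 only need data local to one partition; step 2 is the only part that needs the per-partition counts gathered in partition order.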
