Hello everyone, Can anyone explain what exactly these two parameters (startphase and endphase) are and how to use them? I'm trying to run a RowSimilarity job on a 50K-row matrix (100 columns) with cosine similarity and the default startphase and endphase parameters, and I'm getting extremely poor performance on a fairly large cluster (after 16 hours, it had only reached 3% of the process). I think this could have something to do with the startphase and endphase parameters. What do you think? How do these parameters affect the RowSimilarity job?
Thanks in advance. Fernando.
