Github user olarayej commented on the pull request: https://github.com/apache/spark/pull/8920#issuecomment-144809819

@shivaram @felixcheung @sun-rui Thanks for your feedback! I totally see your point on the naming (sort vs. arrange), but @NarineK's implementation has two advantages:

1) It supports string column names in both ascending and descending order. In SparkR's current implementation of arrange(), I couldn't do that: arrange(df, desc("Species")) # fails

2) The boolean parameter 'decreasing' is useful. Right now, if you wanted to sort by 100 columns, all of them in descending order, you would need to wrap each column individually, 100 times: desc(data$col1), ..., desc(data$col100), whereas in @NarineK's implementation it suffices to specify decreasing=T.

I'm aware that plyr also takes asc/desc functions, probably because R was not designed with big data in mind. We've seen customer use cases with hundreds of thousands of columns.

Bottom line: I think these are two valid additions to SparkR, and since the code is ready and tested, it won't hurt. Let the user decide which function to use.
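To make the two call styles in the comment concrete, here is a hedged sketch in R, assuming a SparkR DataFrame df with the usual iris-style columns; the decreasing parameter and string-name support reflect the behavior described in this PR discussion, not necessarily the final merged signature:

```r
# Current SparkR style: one desc() wrapper per descending column.
# desc() expects a Column object, so a quoted name fails:
# arrange(df, desc("Species"))   # errors in the current implementation
sorted1 <- arrange(df, desc(df$Species), desc(df$Petal_Width))

# Style proposed in this PR (illustrative): plain string column names,
# with a single 'decreasing' flag applied across all sort columns.
sorted2 <- arrange(df, "Species", "Petal_Width", decreasing = TRUE)
```

For a large number of descending columns, the second form scales better, since the column names can be passed programmatically (e.g. via do.call) without wrapping each one in desc().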