Github user olarayej commented on the pull request: https://github.com/apache/spark/pull/8920#issuecomment-144809819

@shivaram @felixcheung @sun-rui Thanks for your feedback! I totally see your point on the naming (sort vs. arrange), but @NarineK's implementation has two advantages:

1) It supports string column names in both ascending and descending order. In SparkR's current implementation of arrange(), I couldn't do that: arrange(df, desc("Species")) # fails

2) The boolean parameter 'decreasing' is useful. Right now, if you wanted to sort by 100 columns, all of them in descending order, you would need to wrap each column individually, 100 times: desc(data$col1), ..., desc(data$col100), whereas in @NarineK's implementation it suffices to specify decreasing=T.

I'm aware that plyr also takes asc/desc functions, probably because R was not designed with big data in mind. We've seen customer use cases with hundreds of thousands of columns.

Bottom line: I think these are two valid additions to SparkR, and since the code is ready and tested, it won't hurt. Let the user decide which function to use.
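To make the two call styles in the comment concrete, here is a hedged sketch in R, assuming a SparkR DataFrame df with the usual iris-style columns; the decreasing parameter and string-name support reflect the behavior described in this PR discussion, not necessarily the final merged signature:

```r
# Current SparkR style: one desc() wrapper per descending column.
# desc() expects a Column object, so a quoted name fails:
# arrange(df, desc("Species"))   # errors in the current implementation
sorted1 <- arrange(df, desc(df$Species), desc(df$Petal_Width))

# Style proposed in this PR (illustrative): plain string column names,
# with a single 'decreasing' flag applied across all sort columns.
sorted2 <- arrange(df, "Species", "Petal_Width", decreasing = TRUE)
```

For a large number of descending columns, the second form scales better, since the column names can be passed programmatically (e.g. via do.call) without wrapping each one in desc().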