[ 
https://issues.apache.org/jira/browse/SPARK-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533918#comment-14533918
 ] 

Sun Rui commented on SPARK-7435:
--------------------------------

[~shivaram] Thank you for pointing out the reason for this situation. As 
documented in 
https://stat.ethz.ch/R-manual/R-devel/library/methods/html/show.html, show() is 
invoked for automatic printing of an S4 object, similar to toString() in 
Scala and __repr__() in pySpark. I agree that we should keep show() as is.
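
To illustrate the automatic printing mentioned above, here is a minimal base R sketch. The class "Toy" and its slot "cols" are hypothetical and only stand in for an S4 class such as SparkR's DataFrame; no Spark session is involved.

```r
library(methods)

# Hypothetical toy S4 class, used only for illustration (not SparkR's DataFrame).
setClass("Toy", representation(cols = "character"))

# Defining a show() method controls what R prints when an object of this
# class is evaluated at the prompt.
setMethod("show", "Toy", function(object) {
  cat("Toy[", paste(object@cols, collapse = ", "), "]\n", sep = "")
})

t1 <- new("Toy", cols = c("_1", "_2"))

# Evaluating the object by name triggers automatic printing, which calls show(t1).
t1

# Capture the output of an explicit show() call for inspection.
out <- capture.output(show(t1))
```

This is why renaming or repurposing show() would change what users see whenever they type the name of a DataFrame at the R prompt.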

As for showDF() (the counterpart of show() in Scala/pySpark), it serves a different 
goal from head(). head() is for retrieving row objects, while showDF() is for 
printing rows in tabular form. The following code demonstrates the difference:

    df <- createDataFrame(sqlCtx, list(1, 2, 3))
    head(df)
          _1
        1  1
        2  2
        3  3
    cat(showDF(df))
        +---+
        | _1|
        +---+
        |1.0|
        |2.0|
        |3.0|
        +---+

I would suggest keeping showDF(). However, showDF() currently has a problem: it 
does not honor the escape characters in the string returned by Scala 
DF.showString(), so its output looks like this:
    "+---+\n| _1|\n+---+\n|1.0|\n|2.0|\n|3.0|\n+---+\n"
I think we can fix it by using the R cat() function to print the string.
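
The difference between the two printing paths can be sketched in plain base R. The string below is copied from the output above and only stands in for the value returned by DF.showString(); this is an illustration of print() versus cat(), not the actual SparkR implementation.

```r
# Stand-in for the string returned by Scala's DF.showString()
# (assumption: copied from the output shown above; no Spark session needed).
s <- "+---+\n| _1|\n+---+\n|1.0|\n|2.0|\n|3.0|\n+---+\n"

# print() displays the string as an R literal, so each "\n" stays visible
# as a backslash followed by "n" and everything ends up on one line.
printed <- capture.output(print(s))

# cat() writes the raw characters, so each "\n" becomes a real line break
# and the table is laid out over several lines.
catted <- capture.output(cat(s))

length(printed)
length(catted)
```

With cat(), the captured output has one element per table line, which is the tabular form users expect from showDF().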



> Make DataFrame.show() consistent with that of Scala and pySpark
> ---------------------------------------------------------------
>
>                 Key: SPARK-7435
>                 URL: https://issues.apache.org/jira/browse/SPARK-7435
>             Project: Spark
>          Issue Type: Improvement
>          Components: SparkR
>    Affects Versions: 1.4.0
>            Reporter: Sun Rui
>            Priority: Blocker
>
> Currently in SparkR, DataFrame has two methods, show() and showDF(). show() 
> prints the DataFrame column names and types, and showDF() prints the first 
> numRows rows of a DataFrame.
> In Scala and pySpark, show() is used to print rows of a DataFrame. 
> We'd better keep the API consistent unless there is some important reason not to. So 
> I propose to interchange the names (show() and showDF()) in SparkR.



