[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

felixcheung Fri, 09 Nov 2018 00:51:27 -0800

Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22954#discussion_r232167926
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala 
---
    @@ -225,4 +226,25 @@ private[sql] object SQLUtils extends Logging {
         }
         sparkSession.sessionState.catalog.listTables(db).map(_.table).toArray
       }
    +
    +  /**
    +   * R callable function to read a file in Arrow stream format and create 
a `RDD`
    +   * using each serialized ArrowRecordBatch as a partition.
    +   */
    +  def readArrowStreamFromFile(
    +      sparkSession: SparkSession,
    +      filename: String): JavaRDD[Array[Byte]] = {
    +    ArrowConverters.readArrowStreamFromFile(sparkSession.sqlContext, 
filename)
    --- End diff --
    
    what's the advantage of adding this wrapper here - I've thinking to 
eliminate most of these if possible - and just use callJMethod on 
`ArrowConverters` say?



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

Reply via email to