[GitHub] spark pull request #14087: [SPARK-16411][SQL][STREAMING] Add textFile to Str...

holdenk Fri, 08 Jul 2016 11:43:24 -0700

Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14087#discussion_r70121449
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala ---
    @@ -281,6 +281,31 @@ final class DataStreamReader private[sql](sparkSession: SparkSession) extends Lo
       @Experimental
       def text(path: String): DataFrame = format("text").load(path)
     
    +  /**
    +   * Loads text files and returns a [[Dataset]] of String. The underlying schema of the Dataset
    +   * contains a single string column named "value".
    +   *
    +   * If the directory structure of the text files contains partitioning information, those are
    +   * ignored in the resulting Dataset. To include partitioning information as columns, use `text`.
    +   *
    +   * Each line in the text files is a new element in the resulting Dataset. For example:
    +   * {{{
    +   *   // Scala:
    +   *   spark.read.textFile("/path/to/spark/README.md")
    +   *
    +   *   // Java:
    +   *   spark.read().textFile("/path/to/spark/README.md")
    +   * }}}
    +   *
    +   * @param path input path
    +   * @since 2.0.0
    +   */
    +  def textFile(path: String): Dataset[String] = {
    +    if (userSpecifiedSchema.nonEmpty) {
    +      throw new AnalysisException("User specified schema not supported with `textFile`")
    --- End diff --
    
    Since this check is presumably copied from the corresponding function in DataFrameReader, we should probably keep the exception message identical to the one in DataFrameReader (so either update that one too or leave this one as is).
    Also, in the SQL code base we use "User specified" 24 times and "User-specified" 5 times.
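
    One way to keep the two messages from drifting apart, sketched under assumptions: ReaderErrors, BatchReader and StreamReader below are hypothetical stand-ins rather than Spark classes, and IllegalArgumentException stands in for AnalysisException so the snippet runs without Spark on the classpath.

```scala
// Hypothetical sketch: none of these names exist in Spark.
object ReaderErrors {
  // Single place that owns the wording, so both readers report the same text.
  def schemaNotSupported(method: String): String =
    s"User specified schema not supported with `$method`"
}

// Stand-in for the batch reader (DataFrameReader in the discussion above).
final class BatchReader(userSpecifiedSchema: Option[String]) {
  def textFile(path: String): String = {
    if (userSpecifiedSchema.nonEmpty) {
      // IllegalArgumentException is an assumption standing in for AnalysisException.
      throw new IllegalArgumentException(ReaderErrors.schemaNotSupported("textFile"))
    }
    path
  }
}

// Stand-in for the streaming reader (DataStreamReader in the discussion above).
final class StreamReader(userSpecifiedSchema: Option[String]) {
  def textFile(path: String): String = {
    if (userSpecifiedSchema.nonEmpty) {
      throw new IllegalArgumentException(ReaderErrors.schemaNotSupported("textFile"))
    }
    path
  }
}
```

    With something like this, changing the wording (for example to "User-specified") would be a one-line edit that applies to both readers.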


