Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20727#discussion_r174998682 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/TextOptions.scala --- @@ -39,9 +39,12 @@ private[text] class TextOptions(@transient private val parameters: CaseInsensiti */ val wholeText = parameters.getOrElse(WHOLETEXT, "false").toBoolean + val lineSeparator: String = parameters.getOrElse(LINE_SEPARATOR, "\n") + require(lineSeparator.nonEmpty, s"'$LINE_SEPARATOR' cannot be an empty string.") } private[text] object TextOptions { val COMPRESSION = "compression" val WHOLETEXT = "wholetext" + val LINE_SEPARATOR = "lineSep" --- End diff -- My reason is to refer other places so that practically other users feel comfortable, which I usually put more importances. I really don't want to spend time on research why the other references used the term "line". If we think about the plain text, CSV or JSON, the term "line" can be correct in a way. We documented http://jsonlines.org/ (even this reference used the term "line"). I think, for example, the line can be defined by its separator. https://github.com/apache/spark/blob/c36fecc3b416c38002779c3cf40b6a665ac4bf13/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLParserSuite.scala#L1645
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org