Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/22379

> Out of curiosity, is this one related with an actual usecase Maxim? or is this proposed for API consistency?

This is an actual use case: users receive CSV content dumped from another DB and stored as one of the columns (in Kafka, for example). When they read the data back with Spark, they need to parse the strings in that column somehow. Usually they do that manually with string column functions, which is error prone, especially in the case of quoted values. In general you can extract the column, convert it to a `Dataset[String]`, and parse it with `def csv(csvDataset: Dataset[String]): DataFrame`, but joining the resulting DataFrame back to the original one is inconvenient and simply slows down execution.
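As a minimal illustration of why hand-rolled string parsing breaks on quoted values (plain Python stdlib here, not Spark; the column value is a made-up sample):

```python
import csv
import io

# A CSV record whose quoted field contains the delimiter itself.
line = 'id,"Smith, John",42'

# Naive manual parsing: splitting on commas tears the quoted field apart.
naive = line.split(",")
# naive == ['id', '"Smith', ' John"', '42']

# A proper CSV parser respects the quoting.
parsed = next(csv.reader(io.StringIO(line)))
# parsed == ['id', 'Smith, John', '42']
```

The same pitfall applies to doing this with Spark's string column functions, which is why a dedicated CSV-parsing function for columns is preferable.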