[ https://issues.apache.org/jira/browse/SPARK-17407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468372#comment-15468372 ]
Seth Hendrickson commented on SPARK-17407:
------------------------------------------

[~chriddyp] The file stream source checks for new files by using the file name. Appending new rows to a file that already exists has no effect, by design. We can discuss whether the design ought to change, but as far as I can see nothing is "wrong" here. If you want to update a streaming CSV DataFrame, just add new files.

> Unable to update structured stream from CSV
> -------------------------------------------
>
>                 Key: SPARK-17407
>                 URL: https://issues.apache.org/jira/browse/SPARK-17407
>             Project: Spark
>          Issue Type: Question
>          Components: PySpark
>    Affects Versions: 2.0.0
>         Environment: Mac OSX
>                      Spark 2.0.0
>            Reporter: Chris Parmer
>            Priority: Trivial
>              Labels: beginner, newbie
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I am creating a simple example of a Structured Stream from a CSV file with an
> in-memory output stream.
> When I add rows to the CSV file, my output stream does not update with the new
> data. From this example:
> https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/4012078893478893/3202384642551446/5985939988045659/latest.html,
> I expected that subsequent queries on the same output stream would contain
> updated results.
> Here is a reproducible code example: https://plot.ly/~chris/17703
> Thanks for the help here!

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
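The behavior Seth describes — new data is discovered by file name, so appending rows to an already-seen file is invisible, while dropping a new file into the directory is picked up — can be illustrated with a small plain-Python sketch. This is only an analogy for the discovery rule, not Spark's actual implementation; the directory and file names are made up for the example.

```python
import os
import tempfile

def discover_new_files(directory, seen):
    """Return files in `directory` whose *names* have not been seen yet.

    Mimics the rule described in the comment above: once a file name has
    been processed, later appends to that file are never re-read.
    """
    new_files = [name for name in sorted(os.listdir(directory)) if name not in seen]
    seen.update(new_files)
    return new_files

d = tempfile.mkdtemp()
seen = set()

# First micro-batch: a fresh CSV file is discovered as new.
with open(os.path.join(d, "part-0001.csv"), "w") as f:
    f.write("a,1\n")
print(discover_new_files(d, seen))  # ['part-0001.csv']

# Appending rows to the same file: NOT discovered (name already seen).
with open(os.path.join(d, "part-0001.csv"), "a") as f:
    f.write("b,2\n")
print(discover_new_files(d, seen))  # []

# Writing the new rows to a new file instead: discovered on the next batch.
with open(os.path.join(d, "part-0002.csv"), "w") as f:
    f.write("b,2\n")
print(discover_new_files(d, seen))  # ['part-0002.csv']
```

So the workaround for the reported issue is to write each batch of new rows as a separate file in the watched directory rather than appending to an existing CSV.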