Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21477#discussion_r193935386

--- Diff: python/pyspark/sql/streaming.py ---

```diff
@@ -843,6 +844,169 @@ def trigger(self, processingTime=None, once=None, continuous=None):
         self._jwrite = self._jwrite.trigger(jTrigger)
         return self

+    def foreach(self, f):
+        """
+        Sets the output of the streaming query to be processed using the provided writer ``f``.
+        This is often used to write the output of a streaming query to arbitrary storage systems.
+        The processing logic can be specified in two ways.
+
+        #. A **function** that takes a row as input.
```

--- End diff --

I believe `$"columnName"` is more of a language-specific feature in Scala, and I think `df.columnName` is likewise language-specific to Python.

> And ultimately convenience is what matters for the user experience.

The thing is, it sounded to me like we were kind of prejudging it.

> I think we should also add the lambda variant to Scala as well.

+1. I am okay with this, but I hope it isn't usually done this way next time.
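For context, the docstring under discussion describes two ways to pass processing logic to `DataStreamWriter.foreach`: a plain function that receives each row, or a writer object with `open`/`process`/`close` methods. A minimal sketch of both variants, assuming a running `SparkSession` named `spark` and a streaming DataFrame `df` (both hypothetical here; the writer below only illustrates the shape of the API):

```python
def process_row(row):
    # Function variant: called once per output row.
    # Write to arbitrary storage here; printing is just a placeholder.
    print(row)


class RowPrinter:
    # Object variant: lifecycle hooks around per-row processing.
    def open(self, partition_id, epoch_id):
        # Return True to process rows for this partition/epoch,
        # e.g. after opening a connection to the storage system.
        return True

    def process(self, row):
        print(row)

    def close(self, error):
        # Clean up resources; `error` is set if processing failed.
        pass


# With a streaming DataFrame `df` (assumed), either variant would be used as:
# query = df.writeStream.foreach(process_row).start()   # function variant
# query = df.writeStream.foreach(RowPrinter()).start()  # object variant
```

Both variants run on the executors, so `f` must be picklable and should not capture non-serializable state such as open connections; the `open` hook exists precisely so such state can be created per partition.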