Jacek, so I create a cache in the ForeachWriter, write to it in every "process()" call, and flush it on close? Something like that?
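[Editor's note: for reference, a minimal sketch of that buffering pattern in Scala against the Spark 2.x ForeachWriter API. writeBatchToHive() is a hypothetical helper standing in for whatever actually appends the batch to the Hive table; it is not part of any Spark API.]

    import org.apache.spark.sql.ForeachWriter
    import scala.collection.mutable.ArrayBuffer

    class BufferedHiveWriter extends ForeachWriter[String] {

      // Per-partition buffer (the "cache"), created when the partition opens.
      private var buffer: ArrayBuffer[String] = _

      override def open(partitionId: Long, version: Long): Boolean = {
        buffer = ArrayBuffer.empty[String]
        true // true = process this partition
      }

      override def process(value: String): Unit = {
        // Accumulate instead of writing row by row.
        buffer += value
      }

      override def close(errorOrNull: Throwable): Unit = {
        // Flush the whole batch once per partition.
        if (errorOrNull == null && buffer.nonEmpty) {
          writeBatchToHive(buffer)
        }
      }

      // Hypothetical helper: the actual Hive write (e.g. staging files into a
      // partitioned directory and updating the metastore) is up to you.
      private def writeBatchToHive(rows: Seq[String]): Unit = {
        // ...
      }
    }

    // Usage:
    // df.writeStream.foreach(new BufferedHiveWriter).start()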
2017-02-09 12:42 GMT-08:00 Jacek Laskowski <ja...@japila.pl>:
> Hi,
>
> Yes, that's ForeachWriter.
>
> Yes, it works element by element. You're looking for mapPartitions,
> and ForeachWriter has a partitionId that you could use to implement a
> similar thing.
>
> Best regards,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Thu, Feb 9, 2017 at 3:55 AM, Egor Pahomov <pahomov.e...@gmail.com> wrote:
> > Jacek, you mean
> > http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.ForeachWriter
> > ? I do not understand how to use it, since it passes every value
> > separately, not every partition. And adding values to a table one by
> > one would not work.
> >
> > 2017-02-07 12:10 GMT-08:00 Jacek Laskowski <ja...@japila.pl>:
> >>
> >> Hi,
> >>
> >> Have you considered the foreach sink?
> >>
> >> Jacek
> >>
> >> On 6 Feb 2017 8:39 p.m., "Egor Pahomov" <pahomov.e...@gmail.com> wrote:
> >>>
> >>> Hi, I'm thinking of using Structured Streaming instead of the old
> >>> streaming, but I need to be able to save results to a Hive table.
> >>> The documentation for the file sink
> >>> (http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#output-sinks)
> >>> says: "Supports writes to partitioned tables." But being able to write
> >>> to partitioned directories is not enough to write to the table: someone
> >>> needs to write to the Hive metastore. How can I use Structured
> >>> Streaming and write to a Hive table?
> >>>
> >>> --
> >>> Sincerely yours
> >>> Egor Pakhomov
> >
> >
> > --
> > Sincerely yours
> > Egor Pakhomov

--
Sincerely yours
Egor Pakhomov