Thank you so much TD, Matt, Anirudh and Oz. Really appreciate this.

On Fri, Apr 13, 2018 at 9:54 PM, Oz Ben-Ami <ozzi...@gmail.com> wrote:
> I can confirm that Structured Streaming works on Kubernetes, though we're
> not quite in production with that yet. Issues we're looking at are:
>
> - Submission through spark-submit works, but is a bit clunky with a
>   Kubernetes-centered workflow. Spark Operator
>   <https://github.com/GoogleCloudPlatform/spark-on-k8s-operator> is
>   promising, but still in alpha (e.g., we ran into this
>   <https://github.com/kubernetes/kubernetes/issues/56018>). Even better
>   would be something that runs the driver as a Deployment / StatefulSet, so
>   that long-running streaming jobs can be restarted automatically.
> - Dynamic allocation: works with the spark-on-k8s fork, but not with plain
>   Spark 2.3, because it relies on the shuffle service, which hasn't been
>   merged yet. An ideal implementation would be able to connect to a
>   PersistentVolume independently of a node, but that's a bit more complicated.
> - Checkpointing: We checkpoint to a separate HDFS (Dataproc) cluster,
>   which works well for us on both the old Spark Streaming and Structured
>   Streaming. We've successfully experimented with HDFS on Kubernetes
>   <https://github.com/apache-spark-on-k8s/kubernetes-HDFS/tree/master>, but
>   again not in production.
> - UI: Unfortunately Structured Streaming does not yet have a comprehensive
>   UI like the old Spark Streaming, but it does show the basic information
>   (jobs, stages, queries, executors), and other information is generally
>   available in the logs and metrics.
> - Monitoring / Logging: this is a strength of Kubernetes, in that it's all
>   centralized by the cluster. We use Splunk, but it would also be possible to
>   hook up <https://github.com/dhatim/dropwizard-prometheus> Spark's Dropwizard
>   Metrics library to Prometheus, and read logs with fluentd or Stackdriver.
> - Side note: Kafka support in Spark and Structured Streaming is very good,
>   but as of Spark 2.3 there are still a couple of missing features, notably
>   transparent Avro support (UDFs are needed) and taking advantage of
>   transactional processing (introduced to Kafka last year) for better
>   exactly-once guarantees.
>
> On Fri, Apr 13, 2018 at 3:08 PM, Anirudh Ramanathan <ramanath...@google.com> wrote:
>
>> +ozzieba who was experimenting with streaming workloads recently. +1 to
>> what Matt said. Checkpointing and driver recovery are future work.
>> Structured Streaming is important, and it would be good to get some
>> production experience here and try to target improving the feature's
>> support on K8s for 2.4/3.0.
>>
>> On Fri, Apr 13, 2018 at 11:55 AM Matt Cheah <mch...@palantir.com> wrote:
>>
>>> We don’t provide any Kubernetes-specific mechanisms for streaming, such
>>> as checkpointing to persistent volumes. But as long as streaming doesn’t
>>> require persisting to the executor’s local disk, streaming ought to work
>>> out of the box. E.g., you can checkpoint to HDFS, but not to the pod’s local
>>> directories.
>>>
>>> However, I’m unaware of any specific use of streaming with the Spark on
>>> Kubernetes integration right now. I would be curious to get feedback on the
>>> failover behavior.
>>>
>>> -Matt Cheah
>>>
>>> *From: *Tathagata Das <t...@databricks.com>
>>> *Date: *Friday, April 13, 2018 at 1:27 AM
>>> *To: *Krishna Kalyan <krishnakaly...@gmail.com>
>>> *Cc: *user <user@spark.apache.org>
>>> *Subject: *Re: Structured Streaming on Kubernetes
>>>
>>> Structured streaming is stable in production! At Databricks, we and our
>>> customers collectively process almost 100s of billions of records per day
>>> using SS.
>>> However, we are not using Kubernetes :)
>>>
>>> Though I don't think it will matter too much as long as kubes are
>>> correctly provisioned+configured and you are checkpointing to HDFS (for
>>> fault-tolerance guarantees).
>>>
>>> TD
>>>
>>> On Fri, Apr 13, 2018, 12:28 AM Krishna Kalyan <krishnakaly...@gmail.com>
>>> wrote:
>>>
>>> Hello All,
>>>
>>> We were evaluating Spark Structured Streaming on Kubernetes (running on
>>> GCP). It would be awesome if the Spark community could share their
>>> experience around this. I would like to know more about your production
>>> experience and the monitoring tools you are using.
>>>
>>> Since Spark on Kubernetes is a relatively new addition to Spark, I was
>>> wondering if Structured Streaming is stable in production. We were also
>>> evaluating Apache Beam with Flink.
>>>
>>> Regards,
>>>
>>> Krishna
>>
>> --
>> Anirudh Ramanathan
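
For anyone finding this thread later, the checkpointing advice above (checkpoint to HDFS, not to the pod's local directories) can be sketched as a small pre-submit check. This is only an illustration under stated assumptions, not anything Spark ships: the helper names, the list of "durable" URI schemes, and the host, image, and jar names in the usage note are all hypothetical. The `k8s://` master prefix and the `spark.kubernetes.container.image` and `spark.sql.streaming.checkpointLocation` configuration keys are real Spark settings.

```python
from urllib.parse import urlparse

# URI schemes treated as surviving a pod restart. This set is an assumption
# made for illustration; Spark itself does not enforce any such list.
DURABLE_SCHEMES = {"hdfs", "gs", "s3a"}


def is_durable_checkpoint(location):
    """Return True if the checkpoint URI uses one of the durable schemes."""
    return urlparse(location).scheme in DURABLE_SCHEMES


def build_submit_args(api_server, image, checkpoint, app_resource):
    """Build a spark-submit argument list for cluster mode on Kubernetes.

    Raises ValueError when the checkpoint location is a bare path or a
    file:// URI, which would live on the pod's local disk.
    """
    if not is_durable_checkpoint(checkpoint):
        raise ValueError("checkpoint %r is not on durable storage" % checkpoint)
    return [
        "spark-submit",
        "--master", "k8s://" + api_server,
        "--deploy-mode", "cluster",
        "--conf", "spark.kubernetes.container.image=" + image,
        "--conf", "spark.sql.streaming.checkpointLocation=" + checkpoint,
        app_resource,
    ]
```

With placeholder values, `build_submit_args("https://k8s.example.com:6443", "example/spark:2.3.0", "hdfs://namenode:8020/checkpoints/job1", "local:///opt/spark/jars/streaming-job.jar")` returns the argument list, while passing `/tmp/checkpoints` or `file:///tmp/checkpoints` as the checkpoint raises `ValueError` before anything is submitted.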