Re: Lateness for Spark

2018-09-12 Thread Amit Sela
Event-time triggers (via watermarks) should be supported as well. On Sun, Sep 9, 2018 at 11:57 PM Vishwas Bm wrote: > Hi, > > Thanks for the reply. As per the beam capability matrix only > Processing-time triggers is supported by spark runner. > As this page is not updated, what other triggers

Re: Lateness for Spark

2018-09-09 Thread Amit Sela
I don't think the capability matrix is up to date; the Spark runner uses LateDataUtils to handle late elements - https://github.com/apache/beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L300 On Fri, Sep 7, 2018 at 6:43 PM Rag
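For context, a minimal sketch of how allowed lateness is typically declared on a windowed PCollection in the Beam Java SDK (the `input` collection, window size, and lateness duration are illustrative placeholders, not from this thread):

```java
import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

// Elements arriving after the watermark passes the end of their window,
// but within the allowed lateness, are still handled rather than dropped.
static PCollection<String> withAllowedLateness(PCollection<String> input) {
  return input.apply(
      Window.<String>into(FixedWindows.of(Duration.standardMinutes(1)))
          .triggering(AfterWatermark.pastEndOfWindow())
          .withAllowedLateness(Duration.standardMinutes(5))
          .discardingFiredPanes());
}
```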

Re: automatic runner inference

2017-04-05 Thread Amit Sela
I agree that the Spark runner should support submitting programmatically to a cluster, though not instead of "spark-submit" but in addition, so that Beam users enjoy a good experience of portability while Spark users enjoy a quick ramp-up into Beam. You can follow: https://issues.apache.org/jira/brow

Re: Slack

2017-03-13 Thread Amit Sela
I'm so well trained, I do it on my phone now! On Mon, Mar 13, 2017, 15:24 Tobias Feldhaus wrote: > Same for me please :) > > Tobi > > > > On 13.03.17, 13:30, "Amit Sela" wrote: > > > > Done. Welcome! > > > > On Mon, Mar 13, 2017 a

Re: Slack

2017-03-13 Thread Amit Sela
Done. Welcome! On Mon, Mar 13, 2017 at 2:29 PM Alexander Gallego wrote: > same for me please. > > > > > .alex > > > > On Fri, Mar 10, 2017 at 3:01 PM, Amit Sela wrote: > > Done > > On Fri, Mar 10, 2017, 21:59 Devon Meunier > wrote: > > Hi!

Re: Regarding Beam Slack Channel

2017-03-11 Thread Amit Sela
Done. Welcome! On Sat, Mar 11, 2017, 15:33 Piyush Muthal wrote: > Hello > > Can someone please add me to the Beam slack channel? > > Thanks. >

Re: Slack

2017-03-10 Thread Amit Sela
Done On Fri, Mar 10, 2017, 21:59 Devon Meunier wrote: > Hi! > > Sorry for the noise but could someone invite me to the slack channel? > > Thanks, > > Devon >

Re: Regarding Beam Slack channel

2017-03-10 Thread Amit Sela
Done. Welcome! On Fri, Mar 10, 2017, 12:30 Borisa Zivkovic wrote: > Hi, > > can someone please add me to beam slack channel? > > thanks > Borisa >

Re: Monitoring and Management Tools for Beam and Friends

2017-03-02 Thread Amit Sela
+Stas Levin On Thu, Mar 2, 2017 at 5:30 PM Jean-Baptiste Onofré wrote: Hi Benjamin, It's a bit related to the Metric discussion on the dev@ mailing list. Today, we leverage the monitoring and management provided by the execution engine of the runner. For instance, with the Spark runner, we c

Re: Is it possible that a feature which the underlying engine (e.g. Spark) supports, but can't be expressed using the Beam API?

2017-02-23 Thread Amit Sela
If I understand, the use case concerning stop() and getState() in PipelineRunner is how to interact with a running pipeline - query its state and (potentially) stop it - right? If so, this API is available via PipelineResult
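A minimal sketch of that PipelineResult interaction, assuming a recent Beam Java SDK (the surrounding method is illustrative only):

```java
import java.io.IOException;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;

static void runQueryAndStop(Pipeline pipeline) throws IOException {
  // run() returns a handle to the (possibly still running) pipeline.
  PipelineResult result = pipeline.run();

  // Query the current state: RUNNING, DONE, FAILED, CANCELLED, ...
  PipelineResult.State state = result.getState();
  System.out.println("Pipeline state: " + state);

  // Stop the running pipeline; cancel() may throw IOException.
  result.cancel();
}
```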

Re: collect to local

2017-02-20 Thread Amit Sela
-reduce > tool-set, it would just be way easier implementing it locally. > > Antony. > On Monday, 20 February 2017, 10:53, Amit Sela > wrote: > > > Hi Antony, > > Generally, PCollections are a distributed bag of elements, just like Spark > RDDs (for batch). > Assu

Re: collect to local

2017-02-20 Thread Amit Sela
Spark runner's EvaluationContext has a hook ready for this - but clearly only for batch, in streaming this feature doesn't seem relevant. You can easily ha

Re: collect to local

2017-02-20 Thread Amit Sela
Hi Antony, Generally, PCollections are distributed bags of elements, just like Spark RDDs (for batch). Assuming you have a distributed collection, you probably wouldn't want to materialize it locally, and even if it's a global count (result) of some kind (guaranteeing to avoid OOM in your "driver
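As an illustration of aggregating instead of materializing the collection locally, a hedged sketch of a global count written to text files (names such as `elements` and `outputPrefix` are placeholders):

```java
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptors;

// Reduce the distributed collection to a single global count and write it out,
// rather than trying to pull every element back to the driver.
static void writeGlobalCount(PCollection<String> elements, String outputPrefix) {
  elements
      .apply(Count.globally())
      .apply(MapElements.into(TypeDescriptors.strings()).via(String::valueOf))
      .apply(TextIO.write().to(outputPrefix).withoutSharding());
}
```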

Setting synchronized processing time triggers

2017-02-19 Thread Amit Sela
Hi all, I was wondering how to use the AfterSynchronizedProcessingTime trigger. For processing-time triggers there's the AfterProcessingTime.pastFirstElementInPane() API, but I've noticed that the AfterSynchronizedProcessingTime constructor is only called from a continuation trigger. I wonder if someone
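For reference, a minimal sketch of the processing-time trigger API mentioned above (the window, delay, and accumulation mode are illustrative choices, not from the thread):

```java
import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
import org.apache.beam.sdk.transforms.windowing.GlobalWindows;
import org.apache.beam.sdk.transforms.windowing.Repeatedly;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

// Fire a pane 10 seconds of processing time after the first element in the pane,
// repeatedly, within the global window.
static <T> PCollection<T> processingTimeTriggered(PCollection<T> input) {
  return input.apply(
      Window.<T>into(new GlobalWindows())
          .triggering(Repeatedly.forever(
              AfterProcessingTime.pastFirstElementInPane()
                  .plusDelayOf(Duration.standardSeconds(10))))
          .withAllowedLateness(Duration.ZERO)
          .discardingFiredPanes());
}
```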

Re: possible reasons for exception "Cannot move input watermark time backwards from"

2017-02-06 Thread Amit Sela
.com] > *Sent:* Monday, 6 February 2017 12:19 > *To:* user@beam.apache.org > *Subject:* Re: possible reasons for exception "Cannot move input > watermark time backwards from" > > > > See below > > > > *From:* Amit Sela [mailto:amitsel...@gmail.com ] >

Re: possible reasons for exception "Cannot move input watermark time backwards from"

2017-02-05 Thread Amit Sela
@Rico: filed BEAM-1395 <https://issues.apache.org/jira/browse/BEAM-1395>. This should be sorted soon enough. Thanks for reporting this issue! On Fri, Feb 3, 2017 at 4:31 PM Amit Sela wrote: > OK, this is indeed a different stacktrace - the problem now is in > SparkGroupAlsoByWInd

Re: possible reasons for exception "Cannot move input watermark time backwards from"

2017-02-03 Thread Amit Sela
rReadCheckpoint(RDD.scala:306) > > at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) > > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) > > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41

Re: possible reasons for exception "Cannot move input watermark time backwards from"

2017-02-03 Thread Amit Sela
ested I tried it with the current beam0.5.0-SNAPSHOT. But ran > into the same error … > > > > Any further ideas or suggestions? > > > > Best, > > Rico. > > > > *From:* Amit Sela [mailto:amitsel...@gmail.com] > *Sent:* Thursday, 2 February 2017 1

Re: possible reasons for exception "Cannot move input watermark time backwards from"

2017-02-02 Thread Amit Sela
Hi Rico, Batch sort of uses Watermarks by noting the "start watermark" at the beginning of time, and the "end watermark" at the end of time (this is the "overflow" you see), or in a more "Beam" way, the watermark at the beginning is the start of time, and after processing all the elements the wat
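In code terms, the "beginning of time" and "end of time" correspond to the Beam Java SDK constants below; the "overflow"-looking timestamp in the logs is simply the maximum representable value (a sketch, assuming the standard SDK; the class is illustrative only):

```java
import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
import org.joda.time.Instant;

class BatchWatermarkBounds {
  // The batch "watermark" starts at the minimum representable timestamp and,
  // once all input has been processed, jumps to the maximum one - the
  // far-future, "overflow"-looking value that shows up in logs.
  static final Instant START_OF_TIME = BoundedWindow.TIMESTAMP_MIN_VALUE;
  static final Instant END_OF_TIME = BoundedWindow.TIMESTAMP_MAX_VALUE;
}
```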

Re: Beam Spark/Flink runner with DC/OS

2017-02-01 Thread Amit Sela
. I’m using the > code described here https://beam.apache.org/get-started/wordcount-example/ > So the file already exists in GS. > > On Jan 23, 2017, at 4:55 PM, Chaoran Yu wrote: > > I didn’t upload the file. But since the identical Beam code, when running > in Spark local m

Re: Beam Spark/Flink runner with DC/OS

2017-01-23 Thread Amit Sela
tarted/quickstart/ says the > following: > "you can’t access a local file if you are running the pipeline on an > external cluster”. > I’m indeed trying to run a pipeline on a standalone Spark cluster running > on my local machine. So local files are not an option. > >

Re: Beam Spark/Flink runner with DC/OS

2017-01-23 Thread Amit Sela
k cluster > with sbin/start-master.sh and sbin/start-slave.sh. Then I submitted my Beam > job to that cluster. > The gs file is the kinglear.txt from Beam’s example code and it should be > public. > > My full stack trace is attached. > > Thanks, > Chaoran > > > > O

Re: Regarding Beam Slack Channel

2017-01-23 Thread Amit Sela
Done. Welcome! On Mon, Jan 23, 2017 at 11:29 PM Kai Jiang wrote: > Hi Amit, > > I am a new contributor. Could you add me as well? jiang...@gmail.com > > Thanks, > Kai > > On Mon, Jan 23, 2017 at 1:04 PM, Davor Bonaci wrote: > > I wouldn't be in favor of such "solutions" on top of Slack, unless

Re: Beam Spark/Flink runner with DC/OS

2017-01-23 Thread Amit Sela
luster > mode (spark-submit) with the error I pasted in the previous email. > > In SparkRunner’s case, can it be that Spark executor can’t access gs file > in Google Storage? > > Thank you, > > > > On Jan 23, 2017, at 3:28 PM, Amit Sela wrote: > > Is this workin

Re: Regarding Beam Slack Channel

2017-01-23 Thread Amit Sela
Done. Welcome! On Mon, Jan 23, 2017 at 10:26 PM Wyatt Frelot wrote: > Can I be added as well? > > wjfr...@gmail.com > > Wyatt > > On Jan 23, 2017 3:20 PM, "Amit Sela" wrote: > > Done. Welcome! > > On Mon, Jan 23, 2017 at 10:17 PM Brackman, Levi <

Re: Beam Spark/Flink runner with DC/OS

2017-01-23 Thread Amit Sela
when running in cluster mode. How do > you guys doing file IO in Beam when using the SparkRunner? > > > Thank you, > Chaoran > > > On Jan 22, 2017, at 4:32 AM, Amit Sela wrote: > > I'lll join JB's comment on the Spark runner saying that submitting Beam >

Re: Regarding Beam Slack Channel

2017-01-23 Thread Amit Sela
Done. Welcome! On Mon, Jan 23, 2017 at 10:10 PM Chaoran Yu wrote: > Can anyone add me to the channel as well? Thank you! > chaoran...@lightbend.com > > Chaoran > > On Jan 23, 2017, at 1:32 AM, Ritesh Kasat wrote: > > Hello, > > Can someone add me to the Beam slack channel. > > Thanks > Ritesh >

Re: Regarding Beam Slack Channel

2017-01-23 Thread Amit Sela
Done. Welcome! On Mon, Jan 23, 2017 at 10:03 PM Xu Mingmin wrote: > can anyone add me to the channel as well? Thanks a lot! > > mmxu1...@gmail.com > > Mingmin > > On Mon, Jan 23, 2017 at 10:46 AM, Jean-Baptiste Onofré > wrote: > > Thanks Jason, > > that's a good alternative if open channel is n

Re: Beam Spark/Flink runner with DC/OS

2017-01-22 Thread Amit Sela
I'll join JB's comment on the Spark runner: submitting Beam pipelines using the Spark runner can be done with Spark's spark-submit script; find out more in the Spark runner documentation. Amit. On Sun, Jan 22, 2017 at 8:03 AM Jea
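A minimal sketch of a driver entry point that such a spark-submit invocation would launch (the class name, jar name, and pipeline contents are placeholders, not from this thread):

```java
import org.apache.beam.runners.spark.SparkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class BeamOnSparkMain {
  public static void main(String[] args) {
    // Typically launched with something like:
    //   spark-submit --class BeamOnSparkMain my-bundled-jar.jar --runner=SparkRunner ...
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
    options.setRunner(SparkRunner.class);

    Pipeline pipeline = Pipeline.create(options);
    // ... build the pipeline here ...

    pipeline.run().waitUntilFinish();
  }
}
```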

Re: Streaming job with Kafka on SparkRunner

2017-01-21 Thread Amit Sela
obatch, and keep alive. Please correct me if any misunderstanding. Mingmin On Sat, Jan 21, 2017 at 1:35 AM, Amit Sela wrote: Not sure why this would cause the application to crash, but I can give some background about how the Spark runner reads microbatches from Kafka (and generally UnboundedSour

Re: Streaming job with Kafka on SparkRunner

2017-01-21 Thread Amit Sela
sl.provider = null > ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1] > ssl.keystore.location = null > ssl.cipher.suites = null > security.protocol = PLAINTEXT > ssl.keymanager.algorithm = SunX509 > metrics.sample.window.ms = 3 > sasl.callback.handler.

Re: Streaming job with Kafka on SparkRunner

2017-01-20 Thread Amit Sela
is there a runnable example for Spark streaming so I can refer to? > > Thanks! > Mingmin > > On Fri, Jan 20, 2017 at 11:45 AM, Amit Sela wrote: > > The WakeupException is being logged and not thrown (it is OK since the > reader was closed due to end-of-microbatch), so I wonde

Re: Streaming job with Kafka on SparkRunner

2017-01-20 Thread Amit Sela
The WakeupException is being logged and not thrown (it is OK since the reader was closed due to end-of-microbatch), so I wonder what causes "ERROR StreamingListenerBus: StreamingListenerBus has already stopped". Are you running in local mode ("local[*]"), or over YARN? Any specific options you'r

Re: KafkaIO Example

2017-01-11 Thread Amit Sela
I assume you're using the Spark runner, since ConsoleIO is a Spark-runner-only transform (mostly for POCs and playground). If so, could you please share the error you see? Amit. On Wed, Jan 11, 2017 at 11:57 PM Madhire, Naveen < naveen.madh...@capitalone.com> wrote: > Hi, > > > > I am trying to
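For reference, a hedged sketch of a KafkaIO read that could feed such a pipeline (the broker address and topic name are placeholders; assumes a recent Beam KafkaIO):

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.transforms.Values;
import org.apache.beam.sdk.values.PCollection;
import org.apache.kafka.common.serialization.LongDeserializer;
import org.apache.kafka.common.serialization.StringDeserializer;

// Read KV<Long, String> records from Kafka and keep only the values.
static PCollection<String> readFromKafka(Pipeline pipeline) {
  return pipeline
      .apply(KafkaIO.<Long, String>read()
          .withBootstrapServers("localhost:9092")    // placeholder broker
          .withTopic("my-topic")                     // placeholder topic
          .withKeyDeserializer(LongDeserializer.class)
          .withValueDeserializer(StringDeserializer.class)
          .withoutMetadata())                        // KafkaRecord -> KV<Long, String>
      .apply(Values.create());
}
```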