Re: event log directory(spark-history) filled by large .inprogress files for spark streaming applications

2019-07-17 Thread Shahid K. I.
Hi, With the current design, event logs are not ideal for long-running streaming applications, so it is better to disable them. There was a proposal to split the event logs by size/job/query for long-running applications, but I am not sure what became of it. Regards, Sha

Re: event log directory(spark-history) filled by large .inprogress files for spark streaming applications

2019-07-17 Thread Artur Sukhenko
Hi. There is a workaround for that. You can disable event logs for Spark Streaming applications. On Tue, Jul 16, 2019 at 1:08 PM raman gugnani wrote: > HI , > > I have long running spark streaming jobs. > Event log directories are getting filled with .inprogress files. > Is th

event log directory(spark-history) filled by large .inprogress files for spark streaming applications

2019-07-16 Thread raman gugnani
Hi, I have long-running Spark Streaming jobs. The event log directories are filling up with .inprogress files. Is there a fix or workaround for Spark Streaming? There is also a JIRA issue raised for the same problem: https://issues.apache.org/jira/browse/SPARK-22783 -- Raman Gugnani 85888
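
The workaround suggested in the two replies above (turning event logging off) comes down to one configuration key, `spark.eventLog.enabled`. Below is a minimal sketch of how the `spark-submit` arguments could be assembled. The helper names and the application file `streaming_app.py` are hypothetical. The rolling options (`spark.eventLog.rolling.enabled`, `spark.eventLog.rolling.maxFileSize`) are not mentioned in this thread; they exist in Spark 3.0 and later, so treat that branch as an assumption about your Spark version.

```python
from typing import List


def event_log_conf(disable: bool = True, rolling_max_size: str = "128m") -> List[str]:
    """Return --conf arguments that either disable or roll event logs."""
    if disable:
        # The workaround from the thread: no event log is written at all.
        settings = {"spark.eventLog.enabled": "false"}
    else:
        # Assumption: Spark 3.0+, where event logs can be split by size.
        settings = {
            "spark.eventLog.enabled": "true",
            "spark.eventLog.rolling.enabled": "true",
            "spark.eventLog.rolling.maxFileSize": rolling_max_size,
        }
    args: List[str] = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args


def submit_command(app: str, conf_args: List[str]) -> List[str]:
    """Build the full spark-submit command line (hypothetical helper)."""
    return ["spark-submit", *conf_args, app]


if __name__ == "__main__":
    # streaming_app.py is a placeholder for the real application.
    print(" ".join(submit_command("streaming_app.py", event_log_conf())))
```

Disabling the log also means the application will not appear in the history server, so the rolling variant may be preferable where it is available.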

Re: share datasets across multiple spark-streaming applications for lookup

2017-11-02 Thread JG Perrin
t 7:53 PM To: "user@spark.apache.org" Subject: share datasets across multiple spark-streaming applications for lookup Hi, What is the recommended way to share datasets across multiple spark-streaming applications, so t

Re: share datasets across multiple spark-streaming applications for lookup

2017-10-31 Thread Joseph Pride
30, 2017 at 7:53 PM >> To: "user@spark.apache.org" >> Subject: share datasets across multiple spark-streaming applications for >> lookup >> >> >> >> Hi, >> >> >> >> What is the recommended way to share datas

Re: share datasets across multiple spark-streaming applications for lookup

2017-10-31 Thread Gene Pang
th multiple Apps doing lookups simultaneously? Are there better > options? Thank you. > > > > *From: *roshan joe > *Date: *Monday, October 30, 2017 at 7:53 PM > *To: *"user@spark.apache.org" > *Subject: *share datasets across multiple spark-streaming applications >

Re: share datasets across multiple spark-streaming applications for lookup

2017-10-31 Thread Revin Chalil
"user@spark.apache.org" Subject: share datasets across multiple spark-streaming applications for lookup Hi, What is the recommended way to share datasets across multiple spark-streaming applications, so that the incoming data can be looked up against this shared dataset? The shared datas

share datasets across multiple spark-streaming applications for lookup

2017-10-30 Thread roshan joe
Hi, What is the recommended way to share datasets across multiple spark-streaming applications, so that the incoming data can be looked up against this shared dataset? The shared dataset is also incrementally refreshed and stored on S3. Below is the scenario. Streaming App-1 consumes data from

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-24 Thread Dirceu Semighini Filho
Thursday 15 December 2016, Divya Gehlot >> wrote: >> >>> It depends on the use case ... >>> Spark always depends on resource availability. >>> As long as you have the resources to accommodate them, you can run as many Spark/Spark >>> Streaming applications as you need. >

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-24 Thread shyla deshpande
;> >> >> Thanks, >> Divya >> >> On 15 December 2016 at 08:42, shyla deshpande >> wrote: >> >>> How many Spark streaming applications can be run at a time on a Spark >>> cluster? >>> >>> Is it better to have 1 spar

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-14 Thread Akhilesh Pathodia
ark/spark > streaming application. > > > Thanks, > Divya > > On 15 December 2016 at 08:42, shyla deshpande > wrote: > >> How many Spark streaming applications can be run at a time on a Spark >> cluster? >> >> Is it better to have 1 spark stre

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-14 Thread Divya Gehlot
It depends on the use case ... Spark always depends on resource availability. As long as you have the resources to accommodate them, you can run as many Spark/Spark Streaming applications as you need. Thanks, Divya On 15 December 2016 at 08:42, shyla deshpande wrote: > How many Spark streaming applications can

How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-14 Thread shyla deshpande
How many Spark streaming applications can be run at a time on a Spark cluster? Is it better to have 1 spark streaming application to consume all the Kafka topics or have multiple streaming applications when possible to keep it simple? Thanks

How to resolve Scheduling delay in Spark streaming applications?

2016-05-10 Thread Hemalatha A
Hello, We are facing a large scheduling delay in our Spark Streaming application and are not sure how to debug why it is happening. We have applied all the tuning we could on the Spark side. Can someone advise how to debug the cause of the delay, and share some tips for resolving it, please? -- Regards Hemalatha

Durability of Spark Streaming Applications

2015-01-22 Thread Wang, Daniel
I deployed Spark Streaming applications to a standalone cluster. After a cluster restart, all the deployed applications are gone and I cannot see any of them in the Spark Web UI. How can I make the Spark Streaming applications durable so that they restart automatically after a cluster restart

Re: Spark Streaming Applications

2014-10-28 Thread sivarani
th spark context, but how can I run it from code automatically when a condition becomes true, without actually using spark-submit? Is it possible? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Applications-tp16976p17453.html Sent from the Apache

Re: Spark Streaming Applications

2014-10-27 Thread Akhil
spork-streaming/blob/master/src/org/apache/pig/backend/hadoop/executionengine/spark_streaming/SparkStreamingLauncher.java#L183> and all, this is a pretty big project actually. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Applications-

Re: Spark Streaming Applications

2014-10-23 Thread Tathagata Das
Cc'ing Helena for more information on this. TD On Thu, Oct 23, 2014 at 6:30 AM, Saiph Kappa wrote: > What is the application about? I couldn't find any proper description > regarding the purpose of killrweather ( I mean, other than just integrating > Spark with Cassandra). Do you know if the sl

Re: Spark Streaming Applications

2014-10-23 Thread Saiph Kappa
What is the application about? I couldn't find any proper description regarding the purpose of killrweather ( I mean, other than just integrating Spark with Cassandra). Do you know if the slides of that tutorial are available somewhere? Thanks! On Wed, Oct 22, 2014 at 6:58 PM, Sameer Farooqui wr

Re: Spark Streaming Applications

2014-10-22 Thread Sameer Farooqui
Hi Saiph, Patrick McFadin and Helena Edelson from DataStax taught a tutorial at NYC Strata last week where they created a prototype Spark Streaming + Kafka application for time series data. You can see the code here: https://github.com/killrweather/killrweather On Tue, Oct 21, 2014 at 4:33 PM,

Spark Streaming Applications

2014-10-21 Thread Saiph Kappa
Hi, I have been trying to find a fairly complex application that makes use of the Spark Streaming framework. I checked public github repos but the examples I found were too simple, only comprising simple operations like counters and sums. On the Spark summit website, I could find very interesting

Re: any code examples demonstrating spark streaming applications which depend on states?

2014-10-02 Thread Chia-Chun Shih
t you seek is what happens "out of the box" (unless I'm > misunderstanding the question) > > On Wed, Oct 1, 2014 at 4:13 AM, Chia-Chun Shih > wrote: > >> Hi, >> >> Are there any code examples demonstrating spark streaming applications >> which depend on states? That is, last-run *updateStateByKey* results are >> used as inputs. >> >> Thanks.

Re: any code examples demonstrating spark streaming applications which depend on states?

2014-10-01 Thread Yana Kadiyska
13 AM, Chia-Chun Shih wrote: > Hi, > > Are there any code examples demonstrating spark streaming applications > which depend on states? That is, last-run *updateStateByKey* results are > used as inputs. > > Thanks.

any code examples demonstrating spark streaming applications which depend on states?

2014-10-01 Thread Chia-Chun Shih
Hi, Are there any code examples demonstrating spark streaming applications which depend on states? That is, last-run *updateStateByKey* results are used as inputs. Thanks.
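
To make the question concrete, here is a minimal plain-Python simulation of the semantics being asked about: the state produced for one batch is fed back in as input for the next. This is a sketch, not Spark code. `update_state_by_key` and `running_count` are hypothetical names, and the batches are made-up sample data. In a real PySpark job the update function would be passed to `DStream.updateStateByKey`, which also requires checkpointing to be enabled on the streaming context.

```python
from typing import Callable, Dict, List, Optional, Tuple

Batch = List[Tuple[str, int]]
UpdateFunc = Callable[[List[int], Optional[int]], Optional[int]]


def running_count(new_values: List[int], last_count: Optional[int]) -> Optional[int]:
    """Update function: add this batch's values to the previous state."""
    return sum(new_values) + (last_count or 0)


def update_state_by_key(batch: Batch, state: Dict[str, int],
                        update_func: UpdateFunc) -> Dict[str, int]:
    """Simulate one micro-batch: group values by key, then update each key."""
    grouped: Dict[str, List[int]] = {}
    for key, value in batch:
        grouped.setdefault(key, []).append(value)

    next_state: Dict[str, int] = {}
    # Keys with existing state are updated even if the batch has no new values.
    for key in set(state) | set(grouped):
        updated = update_func(grouped.get(key, []), state.get(key))
        if updated is not None:  # returning None drops the key
            next_state[key] = updated
    return next_state


# Made-up sample input: three micro-batches of (key, 1) pairs.
batches: List[Batch] = [
    [("a", 1), ("b", 1)],
    [("a", 1)],
    [("b", 1), ("b", 1)],
]

state: Dict[str, int] = {}
for batch in batches:
    state = update_state_by_key(batch, state, running_count)

print(state)  # final counts: a=2, b=3 (dict key order may vary)
```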

Re: Simple Question: Spark Streaming Applications

2014-09-30 Thread Tathagata Das
More importantly, why are you asking this question? :) Also let me generalize the answer by saying that most applications that do some useful computations use map-like operations. And by map-like operations I mean simple operations like map, filter, flatMap, mapPartitions. The only category of appl

Re: Simple Question: Spark Streaming Applications

2014-09-30 Thread Tobias Pfeiffer
Hi, On Wed, Oct 1, 2014 at 12:20 AM, Saiph Kappa wrote: > But most applications use transformations, and map in particular, correct? > Yes, I would claim that most applications that do some useful computation use map(). Tobias

Re: Simple Question: Spark Streaming Applications

2014-09-30 Thread Saiph Kappa
. > > Thanks, > Liquan > > On Mon, Sep 29, 2014 at 10:15 AM, Saiph Kappa > wrote: > >> Hi, >> >> Do all spark streaming applications use the map operation? or the >> majority of them? >> >> Thanks. >> > > > > -- > Liquan Pei > Department of Physics > University of Massachusetts Amherst >

Re: Simple Question: Spark Streaming Applications

2014-09-29 Thread Liquan Pei
Hi Saiph, Map is used for transformation on your input RDD. If you don't need transformation of your input, you don't need to use map. Thanks, Liquan On Mon, Sep 29, 2014 at 10:15 AM, Saiph Kappa wrote: > Hi, > > Do all spark streaming applications use the map operation? o

Simple Question: Spark Streaming Applications

2014-09-29 Thread Saiph Kappa
Hi, Do all spark streaming applications use the map operation? or the majority of them? Thanks.
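
For readers unfamiliar with the "map-like" operations named in Tathagata Das's reply above (map, filter, flatMap, mapPartitions), the sketch below shows what each one does, using plain Python lists in place of RDDs or DStreams. The sample log lines and the two-way partition split are made up for illustration; this is not the Spark API itself.

```python
from typing import Iterable, Iterator, List

# Made-up sample records standing in for one batch of a stream.
lines: List[str] = ["error disk full", "info started", "error timeout"]

# map: exactly one output element per input element.
upper = [line.upper() for line in lines]

# filter: keep only the elements that satisfy a predicate.
errors = [line for line in lines if line.startswith("error")]

# flatMap: each input element may produce zero or more output elements.
words = [word for line in lines for word in line.split()]


def count_partition(partition: Iterable[str]) -> Iterator[int]:
    """mapPartitions-style function: called once per partition, not per element."""
    yield sum(1 for _ in partition)


# Assume the data is split into two partitions.
partitions = [lines[:2], lines[2:]]
counts = [c for part in partitions for c in count_partition(part)]

print(upper)
print(errors)
print(words)
print(counts)
```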

Re: Multi-tenancy for Spark (Streaming) Applications

2014-09-11 Thread Tobias Pfeiffer
Hi, by now I understand a bit better how spark-submit and YARN play together, and how the Spark driver and slaves play together on YARN. Now for my use case, as described on < https://spark.apache.org/docs/latest/submitting-applications.html>, I would probably have an end-user-facing gateway that

Re: Multi-tenancy for Spark (Streaming) Applications

2014-09-08 Thread Tobias Pfeiffer
Hi, On Thu, Sep 4, 2014 at 10:33 AM, Tathagata Das wrote: > In the current state of Spark Streaming, creating separate Java processes > each having a streaming context is probably the best approach to > dynamically adding and removing input sources. All of these should be > able to use a Y

Re: Multi-tenancy for Spark (Streaming) Applications

2014-09-03 Thread Tathagata Das
In the current state of Spark Streaming, creating separate Java processes, each having a streaming context, is probably the best approach to dynamically adding and removing input sources. All of these should be able to use a YARN cluster for resource allocation. On Wed, Sep 3, 2014 at 6:30 PM

Multi-tenancy for Spark (Streaming) Applications

2014-09-03 Thread Tobias Pfeiffer
Hi, I am not sure if "multi-tenancy" is the right word, but I am thinking about a Spark application where multiple users can, say, log into some web interface and specify a data processing pipeline with streaming source, processing steps, and output. Now as far as I know, there can be only one St