Re: event log directory(spark-history) filled by large .inprogress files for spark streaming applications

2019-07-17 Thread Shahid K. I.
Hi, With the current design, event logs are not ideal for long-running streaming applications, so it is better to disable them. There was a proposal to split the event logs by size/job/query for long-running applications; I am not sure about the follow-up on that. Regards,
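
For reference, turning off the event log is a single configuration setting; a minimal Scala sketch, with the application name as a placeholder:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("my-long-running-streaming-job")  // placeholder name
      .set("spark.eventLog.enabled", "false")       // stop writing the ever-growing .inprogress log

The same setting can also be passed at submission time as --conf spark.eventLog.enabled=false.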

Re: event log directory(spark-history) filled by large .inprogress files for spark streaming applications

2019-07-17 Thread Artur Sukhenko
Hi. There is a workaround for that: you can disable event logs for Spark Streaming applications. On Tue, Jul 16, 2019 at 1:08 PM raman gugnani wrote: > Hi, I have long running spark streaming jobs. Event log directories are getting filled with .inprogress files. Is there a fix or work around for spark streaming?

event log directory(spark-history) filled by large .inprogress files for spark streaming applications

2019-07-16 Thread raman gugnani
Hi, I have long-running Spark Streaming jobs. Event log directories are getting filled with .inprogress files. Is there a fix or workaround for Spark Streaming? There is also one JIRA raised for the same issue by another reporter: https://issues.apache.org/jira/browse/SPARK-22783 -- Raman Gugnani

Re: share datasets across multiple spark-streaming applications for lookup

2017-11-02 Thread JG Perrin
Thank you. From: roshan joe <impdocs2...@gmail.com> Date: Monday, October 30, 2017 at 7:53 PM To: "user@spark.apache.org" <user@spark.apache.org> Subject: share datasets across multiple spark-streaming applications for lookup

Re: share datasets across multiple spark-streaming applications for lookup

2017-10-31 Thread Joseph Pride
> From: roshan joe <impdocs2...@gmail.com> Date: Monday, October 30, 2017 at 7:53 PM To: "user@spark.apache.org" <user@spark.apache.org> Subject: share datasets across multiple spark-streaming applications for lookup

Re: share datasets across multiple spark-streaming applications for lookup

2017-10-31 Thread Gene Pang
> Do they work well with multiple Apps doing lookups simultaneously? Are there better options? Thank you. > From: roshan joe <impdocs2...@gmail.com> Date: Monday, October 30, 2017 at 7:53 PM To: "user@spark.apache.org" <user@spark.apache.org>

Re: share datasets across multiple spark-streaming applications for lookup

2017-10-31 Thread Revin Chalil
Date: Monday, October 30, 2017 at 7:53 PM To: "user@spark.apache.org" <user@spark.apache.org> Subject: share datasets across multiple spark-streaming applications for lookup Hi, What is the recommended way to share datasets across multiple spark-streaming applications, so that the incoming data can be looked up against this shared dataset?

share datasets across multiple spark-streaming applications for lookup

2017-10-30 Thread roshan joe
Hi, What is the recommended way to share datasets across multiple spark-streaming applications, so that the incoming data can be looked up against this shared dataset? The shared dataset is also incrementally refreshed and stored on S3. Below is the scenario. Streaming App-1 consumes data from
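
One pattern that fits this scenario (a sketch of a possible approach, not something confirmed in the replies) is to have each streaming application reload the shared dataset from S3 on every batch and join it against the incoming data. In the Scala sketch below, the S3 path, the two-string-column schema, and the socket source are all placeholder assumptions:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val spark = SparkSession.builder().appName("streaming-lookup").getOrCreate()
    val ssc = new StreamingContext(spark.sparkContext, Seconds(30))

    // Placeholder input; replace with the real Kafka/Kinesis source.
    val events = ssc.socketTextStream("localhost", 9999)
      .map(line => (line.split(",")(0), line))  // key by the first field

    events.transform { rdd =>
      // transform's body runs on the driver for each batch, so the refreshed S3 copy is re-read here.
      val lookup = spark.read.parquet("s3a://bucket/shared/lookup/")
        .rdd.map(row => (row.getString(0), row.getString(1)))  // assumes two string columns
      rdd.join(lookup)
    }.print()

    ssc.start()
    ssc.awaitTermination()

Re-reading from S3 on every batch is the simplest form; caching the lookup data and refreshing it on a timer, or broadcasting a small dataset, are common refinements.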

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-24 Thread Dirceu Semighini Filho
> accommodate, can run as many spark/spark streaming applications. Thanks, Divya > On 15 December 2016 at 08:42, shyla deshpande <deshpandesh...@gmail.com> wrote: >> How many Spark streaming applications can be run at a time on a Spark cluster? Is it better to have 1 spark streaming application to consume all the Kafka topics or have multiple streaming applications when possible to keep it simple? Thanks

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-24 Thread shyla deshpande
> streaming applications. Thanks, Divya > On 15 December 2016 at 08:42, shyla deshpande <deshpandesh...@gmail.com> wrote: >> How many Spark streaming applications can be run at a time on a Spark cluster?

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-14 Thread Akhilesh Pathodia
> accommodate, can run as many spark/spark streaming applications. Thanks, Divya > On 15 December 2016 at 08:42, shyla deshpande <deshpandesh...@gmail.com> wrote: >> How many Spark streaming applications can be run at a time on a Spark cluster?
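
The point about cluster resources can be made concrete by capping what each application may take, so that several streaming jobs fit side by side. A minimal sketch; the values are placeholders, and spark.cores.max applies to standalone and Mesos deployments:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("streaming-app-1")
      .set("spark.cores.max", "4")          // cap the total cores this app takes from the cluster
      .set("spark.executor.memory", "2g")   // per-executor memory, leaving headroom for other apps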

Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-14 Thread Divya Gehlot
> How many Spark streaming applications can be run at a time on a Spark cluster? Is it better to have 1 spark streaming application to consume all the Kafka topics or have multiple streaming applications when possible to keep it simple? Thanks

How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-14 Thread shyla deshpande
How many Spark streaming applications can be run at a time on a Spark cluster? Is it better to have 1 spark streaming application to consume all the Kafka topics or have multiple streaming applications when possible to keep it simple? Thanks

How to resolve Scheduling delay in Spark streaming applications?

2016-05-10 Thread Hemalatha A
Hello, We are facing a large scheduling delay in our Spark Streaming application and are not sure how to debug why the delay is happening. We have already applied all possible tuning on the Spark side. Can someone advise how to debug the cause of the delay, and share some tips for resolving it, please? -- Regards

Durablility of Spark Streaming Applications

2015-01-22 Thread Wang, Daniel
I deployed Spark Streaming applications to a standalone cluster. After a cluster restart, all the deployed applications are gone and I cannot see any of them through the Spark web UI. How can I make the Spark Streaming applications durable so that they auto-restart after a cluster restart?
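
One possibility on a standalone cluster, not confirmed in this thread, is to submit each application in cluster deploy mode with --supervise so the master restarts a failed driver, combined with master recovery (spark.deploy.recoveryMode) so that application state can survive a master restart. A sketch with placeholder host, class, and jar names:

    spark-submit \
      --master spark://master-host:7077 \
      --deploy-mode cluster \
      --supervise \
      --class com.example.MyStreamingApp \
      my-streaming-app.jar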

Re: Spark Streaming Applications

2014-10-28 Thread Akhil
/master/src/org/apache/pig/backend/hadoop/executionengine/spark_streaming/SparkStreamingLauncher.java#L183 and all, this is a pretty big project actually.

Re: Spark Streaming Applications

2014-10-28 Thread sivarani
automatically when a condition comes true, without actually using spark-submit. Is it possible?
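
For the "stop when a condition comes true" part, one programmatic option is to stop the StreamingContext from inside the driver instead of killing the job externally. A minimal sketch; the socket source, the empty-batch condition, and the 10-second poll interval are placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(new SparkConf().setAppName("conditional-stop"), Seconds(10))
    val stream = ssc.socketTextStream("localhost", 9999)  // placeholder source

    var shouldShutDown = false  // driver-side flag; guard it for cross-thread visibility in real code
    stream.foreachRDD { rdd =>
      // The body of foreachRDD runs on the driver every batch, so it can flip the flag.
      if (rdd.isEmpty()) shouldShutDown = true  // placeholder condition
    }

    ssc.start()
    // Poll instead of blocking forever, so the flag can trigger a graceful stop.
    while (!ssc.awaitTerminationOrTimeout(10000)) {
      if (shouldShutDown) ssc.stop(stopSparkContext = true, stopGracefully = true)
    }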

Re: Spark Streaming Applications

2014-10-23 Thread Saiph Kappa
What is the application about? I couldn't find any proper description regarding the purpose of killrweather (I mean, other than just integrating Spark with Cassandra). Do you know if the slides of that tutorial are available somewhere? Thanks! On Wed, Oct 22, 2014 at 6:58 PM, Sameer Farooqui

Re: Spark Streaming Applications

2014-10-23 Thread Tathagata Das
Cc'ing Helena for more information on this. TD On Thu, Oct 23, 2014 at 6:30 AM, Saiph Kappa saiph.ka...@gmail.com wrote: What is the application about? I couldn't find any proper description regarding the purpose of killrweather (I mean, other than just integrating Spark with Cassandra). Do you know if the slides of that tutorial are available somewhere?

Re: Spark Streaming Applications

2014-10-22 Thread Sameer Farooqui
Hi Saiph, Patrick McFadin and Helena Edelson from DataStax taught a tutorial at NYC Strata last week where they created a prototype Spark Streaming + Kafka application for time series data. You can see the code here: https://github.com/killrweather/killrweather On Tue, Oct 21, 2014 at 4:33 PM,

Spark Streaming Applications

2014-10-21 Thread Saiph Kappa
Hi, I have been trying to find a fairly complex application that makes use of the Spark Streaming framework. I checked public GitHub repos, but the examples I found were too simple, comprising only simple operations like counters and sums. On the Spark Summit website, I could find very interesting

Re: any code examples demonstrating spark streaming applications which depend on states?

2014-10-02 Thread Chia-Chun Shih
the question) On Wed, Oct 1, 2014 at 4:13 AM, Chia-Chun Shih chiachun.s...@gmail.com wrote: Hi, Are there any code examples demonstrating spark streaming applications which depend on states? That is, last-run *updateStateByKey* results are used as inputs. Thanks.

any code examples demonstrating spark streaming applications which depend on states?

2014-10-01 Thread Chia-Chun Shih
Hi, Are there any code examples demonstrating Spark Streaming applications which depend on state? That is, where the last run's *updateStateByKey* results are used as inputs. Thanks.
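
A minimal, self-contained sketch of that pattern, using a socket source and a running word count as placeholders for the real input and state:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(new SparkConf().setAppName("stateful-word-count"), Seconds(10))
    ssc.checkpoint("/tmp/streaming-checkpoint")  // updateStateByKey requires a checkpoint directory

    val pairs = ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split(" "))
      .map(word => (word, 1))

    // newValues holds this batch's counts; runningCount is the state carried over from earlier batches.
    val updateFunc: (Seq[Int], Option[Int]) => Option[Int] =
      (newValues, runningCount) => Some(newValues.sum + runningCount.getOrElse(0))

    pairs.updateStateByKey(updateFunc).print()

    ssc.start()
    ssc.awaitTermination()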

Simple Question: Spark Streaming Applications

2014-09-29 Thread Saiph Kappa
Hi, Do all Spark Streaming applications use the map operation, or just the majority of them? Thanks.

Re: Simple Question: Spark Streaming Applications

2014-09-29 Thread Liquan Pei
Hi Saiph, Map is used for transformations on your input RDD. If you don't need to transform your input, you don't need to use map. Thanks, Liquan On Mon, Sep 29, 2014 at 10:15 AM, Saiph Kappa saiph.ka...@gmail.com wrote: Hi, Do all spark streaming applications use the map operation
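
A one-line illustration of that point, assuming an existing StreamingContext named ssc and a placeholder socket source: the stream is consumed without any map at all, only filtered and counted.

    val lines = ssc.socketTextStream("localhost", 9999)
    // No map needed: the lines are used as-is, just filtered and counted per batch.
    lines.filter(_.contains("ERROR")).count().print()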

Re: Multi-tenancy for Spark (Streaming) Applications

2014-09-11 Thread Tobias Pfeiffer
Hi, by now I have understood a bit better how spark-submit and YARN play together, and how the Spark driver and slaves play together on YARN. Now, for my use case, as described at https://spark.apache.org/docs/latest/submitting-applications.html, I would probably have an end-user-facing gateway that

Re: Multi-tenancy for Spark (Streaming) Applications

2014-09-08 Thread Tobias Pfeiffer
Hi, On Thu, Sep 4, 2014 at 10:33 AM, Tathagata Das tathagata.das1...@gmail.com wrote: In the current state of Spark Streaming, creating separate Java processes, each having its own streaming context, is probably the best approach to dynamically adding and removing input sources. All of these

Multi-tenancy for Spark (Streaming) Applications

2014-09-03 Thread Tobias Pfeiffer
Hi, I am not sure if multi-tenancy is the right word, but I am thinking about a Spark application where multiple users can, say, log into some web interface and specify a data processing pipeline with streaming source, processing steps, and output. Now as far as I know, there can be only one

Re: Multi-tenancy for Spark (Streaming) Applications

2014-09-03 Thread Tathagata Das
In the current state of Spark Streaming, creating separate Java processes, each having its own streaming context, is probably the best approach to dynamically adding and removing input sources. All of these should be able to use a YARN cluster for resource allocation. On Wed, Sep 3, 2014 at 6:30
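
A rough sketch of the "separate process per pipeline" idea, where a gateway service launches one spark-submit (and therefore one driver JVM with its own StreamingContext) per user-defined pipeline; the master, class, and jar names are placeholders:

    import scala.sys.process._

    def launchPipeline(pipelineId: String): Process =
      Seq(
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--class", "com.example.StreamingPipeline",  // driver that builds one StreamingContext
        "pipeline.jar",
        pipelineId                                   // tells the driver which pipeline definition to run
      ).run()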