Hi,
With the current design, event logs are not ideal for long-running streaming
applications, so it is better to disable them. There was a proposal to split
the event logs by size/job/query for long-running applications; I am not sure
whether that was followed up.
Regards,
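(For anyone finding this thread later: the splitting proposal mentioned above
did land as event-log rolling, around Spark 3.0 / SPARK-28869. A sketch of the
relevant spark-defaults.conf entries; please verify the property names against
the docs for your Spark version:)

```properties
# Roll the event log into multiple files instead of one ever-growing
# .inprogress file (added around Spark 3.0; check your version's docs).
spark.eventLog.rolling.enabled          true
# Start a new event-log file once the current one reaches this size.
spark.eventLog.rolling.maxFileSize      128m
# History-server side: after compaction, retain only this many recent files.
spark.history.fs.eventLog.rolling.maxFilesToRetain  2
```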
Hi.
There is a workaround for that.
You can disable event logs for Spark Streaming applications.
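Concretely, that workaround is a single setting; a sketch (put it in
spark-defaults.conf, or pass it per job on the spark-submit command line):

```properties
# Turn off event logging entirely for long-running streaming jobs,
# at the cost of losing the application's History Server entry.
spark.eventLog.enabled  false
```

Per job: spark-submit --conf spark.eventLog.enabled=false ...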
On Tue, Jul 16, 2019 at 1:08 PM raman gugnani wrote:
> Hi,
>
> I have long running spark streaming jobs.
> Event log directories are getting filled with .inprogress files.
> Is there a fix or workaround for Spark Streaming?
Hi,
I have long running spark streaming jobs.
Event log directories are getting filled with .inprogress files.
Is there a fix or workaround for Spark Streaming?
There is also a JIRA raised for the same issue:
https://issues.apache.org/jira/browse/SPARK-22783
--
Raman Gugnani
>>
>> From: roshan joe <impdocs2...@gmail.com>
>> Date: Monday, October 30, 2017 at 7:53 PM
>> To: "user@spark.apache.org" <user@spark.apache.org>
>> Subject: share datasets across multiple spark-streaming applications for
>> lookup
>>
>> Do they work well with multiple Apps doing lookups simultaneously? Are
>> there better options? Thank you.
Hi,
What is the recommended way to share datasets across multiple
spark-streaming applications, so that the incoming data can be looked up
against this shared dataset?
The shared dataset is also incrementally refreshed and stored on S3. Below
is the scenario.
Streaming App-1 consumes data from
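One common pattern for this (a sketch under assumptions, not the official
answer): each streaming app reloads the shared S3 snapshot on a time-to-live
and joins incoming batches against the cached copy. Below is a minimal,
Spark-free Python sketch of just the TTL-reload logic; load_fn is a
hypothetical callable that would read the dataset (e.g. via
spark.read.parquet on the S3 path):

```python
import time

class RefreshableLookup:
    """Caches a lookup dataset and re-invokes load_fn once the cached copy
    is older than ttl_seconds, so each micro-batch sees reasonably fresh data."""

    def __init__(self, load_fn, ttl_seconds):
        self._load_fn = load_fn
        self._ttl = ttl_seconds
        self._data = None
        self._loaded_at = None

    def get(self, now=None):
        now = time.monotonic() if now is None else now
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._data = self._load_fn()   # e.g. read the latest S3 snapshot
            self._loaded_at = now
        return self._data

# Hypothetical usage inside foreachRDD / foreachBatch:
#   lookup = RefreshableLookup(lambda: load_snapshot_from_s3(), ttl_seconds=300)
#   enriched = batch.join(lookup.get(), on="key")
```

Each application keeps its own cache, so they do not contend with each other;
the trade-off is that apps may briefly see snapshots up to one TTL apart.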
If the cluster can accommodate them, you can run as many Spark / Spark
Streaming applications as you like.

Thanks,
Divya

On 15 December 2016 at 08:42, shyla deshpande <deshpandesh...@gmail.com>
wrote:

> How many Spark streaming applications can be run at a time on a Spark
> cluster?
>
> Is it better to have 1 spark streaming application to consume all the
> Kafka topics or have multiple streaming applications when possible to keep
> it simple?
>
> Thanks
How many Spark streaming applications can be run at a time on a Spark
cluster?
Is it better to have 1 spark streaming application to consume all the Kafka
topics or have multiple streaming applications when possible to keep it
simple?
Thanks
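One practical note on running several apps at once: on a standalone cluster
an application grabs every available core by default, so a second app can
starve. A hedged sketch of spark-defaults.conf entries that let multiple apps
coexist (the values are illustrative, not recommendations):

```properties
# Cap the total cores any one application may take on a standalone
# cluster, so several streaming apps can run side by side.
spark.cores.max        8
# Size executors so that memory across all apps fits on the workers.
spark.executor.memory  4g
```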
Hello,
We are facing a large scheduling delay in our Spark streaming application,
and we are not sure how to debug why the delay is happening. We have applied
all the tuning possible on the Spark side.
Can someone advise how to debug the cause of the delay, and share some tips
for resolving it, please?
--
Regards
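For intuition while debugging: a batch's scheduling delay is the gap between
when the batch was due and when processing actually started, and whenever the
per-batch processing time persistently exceeds the batch interval, that delay
grows without bound. A small self-contained Python sketch of that arithmetic
(a model for reasoning, not Spark API):

```python
def scheduling_delays(batch_interval, processing_times):
    """Model of a micro-batch queue: batch i is due at i * batch_interval,
    but can only start once the previous batch has finished processing."""
    delays = []
    finished = 0  # time the previous batch finished
    for i, ptime in enumerate(processing_times):
        due = i * batch_interval
        start = max(due, finished)
        delays.append(start - due)      # scheduling delay for this batch
        finished = start + ptime
    return delays

# Stable: 8s of work fits in a 10s interval -> delay stays at 0.
print(scheduling_delays(10, [8, 8, 8]))     # [0, 0, 0]
# Overloaded: 12s of work per 10s interval -> delay grows 0, 2, 4, ...
print(scheduling_delays(10, [12, 12, 12]))  # [0, 2, 4]
```

If your Streaming tab shows the second shape, the usual fixes are making each
batch cheaper (more parallelism, faster sinks, fewer per-record round trips)
or lengthening the batch interval.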
We deployed Spark Streaming applications to a standalone cluster. After a
cluster restart, all the deployed applications were gone, and I could not see
any applications through the Spark Web UI.
How can we make the Spark Streaming applications durable so that they
auto-restart after a cluster restart?
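Two standalone-mode settings are relevant here, as far as I know: submitting
in cluster mode with --supervise makes the master restart a crashed driver,
and ZooKeeper-based recovery mode lets a restarted master recover registered
applications. A hedged sketch (host names, class, and jar are placeholders):

```shell
# Driver runs inside the cluster and is restarted on failure:
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --supervise \
  --class com.example.StreamingMain \
  app.jar

# In spark-env.sh, let the master persist and recover its state via
# ZooKeeper so applications survive a master restart:
# SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
#   -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181"
```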
/master/src/org/apache/pig/backend/hadoop/executionengine/spark_streaming/SparkStreamingLauncher.java#L183
and all, this is a pretty big project actually.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Applications-tp16976p17440.html
Sent from the Apache Spark User List mailing list archive
automatically when a condition comes true without actually
using spark-submit
Is it possible?
What is the application about? I couldn't find any proper description
regarding the purpose of killrweather (I mean, other than just integrating
Spark with Cassandra). Do you know if the slides of that tutorial are
available somewhere?
Thanks!
On Wed, Oct 22, 2014 at 6:58 PM, Sameer Farooqui wrote:
Cc'ing Helena for more information on this.
TD
On Thu, Oct 23, 2014 at 6:30 AM, Saiph Kappa saiph.ka...@gmail.com wrote:
Hi Saiph,
Patrick McFadin and Helena Edelson from DataStax taught a tutorial at NYC
Strata last week where they created a prototype Spark Streaming + Kafka
application for time series data.
You can see the code here:
https://github.com/killrweather/killrweather
On Tue, Oct 21, 2014 at 4:33 PM,
Hi,
I have been trying to find a fairly complex application that makes use of
the Spark Streaming framework. I checked public github repos but the
examples I found were too simple, only comprising simple operations like
counters and sums. On the Spark summit website, I could find very
interesting
On Wed, Oct 1, 2014 at 4:13 AM, Chia-Chun Shih chiachun.s...@gmail.com
wrote:
Hi,
Are there any code examples demonstrating spark streaming applications
which depend on states? That is, last-run *updateStateByKey* results are
used as inputs.
Thanks.
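A minimal illustration of the stateful pattern asked about: in PySpark,
updateStateByKey takes a function of shape (new_values, last_state) ->
new_state, and last-run results are fed back in as last_state. The sketch
below defines such a function and simulates two micro-batches in plain Python
(the Spark wiring is shown only in comments, with assumed names):

```python
def update_count(new_values, last_state):
    """Fold this batch's values into the state carried over from the last run."""
    return sum(new_values) + (last_state or 0)

# In a real job this would be wired up roughly as (assumptions: a
# StreamingContext `ssc` and a DStream `pairs` of (word, 1) tuples):
#
#   ssc.checkpoint("hdfs:///tmp/checkpoint")   # required for stateful ops
#   counts = pairs.updateStateByKey(update_count)

# Simulating two micro-batches locally to show the state carrying over:
state = {}
for batch in [[("a", 1), ("a", 1), ("b", 1)], [("a", 1)]]:
    grouped = {}
    for k, v in batch:
        grouped.setdefault(k, []).append(v)
    for k, vs in grouped.items():
        state[k] = update_count(vs, state.get(k))

print(state)  # {'a': 3, 'b': 1}
```

Note that checkpointing must be enabled, since the state RDD lineage grows
with every batch.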
Hi,
Do all spark streaming applications use the map operation? or the majority
of them?
Thanks.
Hi Saiph,
map is used to apply a transformation to each record of your input. If you
don't need to transform your input, you don't need to use map.
Thanks,
Liquan
On Mon, Sep 29, 2014 at 10:15 AM, Saiph Kappa saiph.ka...@gmail.com wrote:
Hi,
by now I understand a bit better how spark-submit and YARN play together,
and how the Spark driver and executors interact on YARN.
Now for my usecase, as described on
https://spark.apache.org/docs/latest/submitting-applications.html, I would
probably have a end-user-facing gateway that
Hi,
On Thu, Sep 4, 2014 at 10:33 AM, Tathagata Das tathagata.das1...@gmail.com
wrote:
In the current state of Spark Streaming, creating separate Java processes
each having a streaming context is probably the best approach to
dynamically adding and removing input sources. All of these should be able
to use a YARN cluster for resource allocation.
Hi,
I am not sure if multi-tenancy is the right word, but I am thinking about
a Spark application where multiple users can, say, log into some web
interface and specify a data processing pipeline with streaming source,
processing steps, and output.
Now as far as I know, there can be only one
In the current state of Spark Streaming, creating separate Java processes
each having a streaming context is probably the best approach to
dynamically adding and removing input sources. All of these should be
able to use a YARN cluster for resource allocation.
On Wed, Sep 3, 2014 at 6:30