Hi,
I have the following use case and have not found a suitable tool that
serves my purpose.
Use case:
Steps 1, 2, and 3 are UI-driven.
*Step 1*) A user should be able to choose a data source (for example, HDFS) and
configure it so that it points to a file.
*Step 2*) A user should be
Hi,
I am looking for applications from which we can trigger Spark jobs from a UI.
Are there any such applications available?
I have checked spark-jobserver, which lets us expose an API to submit a
Spark application.
Are there any other alternatives I can use to submit PySpark jobs from a
UI?
Other than @Adrian's suggestions, check whether the batch processing time is
longer than the batch interval.
On Thu, Oct 29, 2015 at 2:23 AM, Adrian Tanase wrote:
> Does it work as expected with smaller batch or smaller load? Could it be
> that it's accumulating too many events over
I create a stream from Kafka as follows:
val kafkaDStream = KafkaUtils
  .createDirectStream[String, KafkaGenericEvent, StringDecoder,
    KafkaGenericEventsDecoder](ssc, kafkaConf, Set(topics))
  .window(Minutes(WINDOW_DURATION), Minutes(SLIDER_DURATION))
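The window/slide semantics of the call above can be illustrated in plain Scala (this is only a sketch of the semantics, not Spark code; each list element stands in for one batch, and the `window`/`slide` parameters play the roles of WINDOW_DURATION and SLIDER_DURATION):

```scala
object WindowSketch {
  // A window `window` batches long that advances by `slide` batches,
  // mirroring .window(Minutes(WINDOW_DURATION), Minutes(SLIDER_DURATION)).
  def windows[A](batches: Seq[A], window: Int, slide: Int): Seq[Seq[A]] =
    batches.sliding(window, slide).map(_.toSeq).toSeq
}
```

With five batches, a window of 3 sliding by 2 produces [1,2,3] and [3,4,5]; the overlap on batch 3 is what reduceByKeyAndWindow with an inverse reduce function exploits to avoid recomputation.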
I have a map ("intToStringList") which is a
<t...@databricks.com> wrote:
> Are you getting this error in local mode?
>
>
> On Tue, Sep 22, 2015 at 7:34 AM, srungarapu vamsi <
> srungarapu1...@gmail.com> wrote:
>
>> Yes, I tried ssc.checkpoint("checkpoint"), it works for me as lon
> ...local mode.
>
> For the others (/tmp/..) make sure you have rights to write there.
>
> -adrian
>
> From: srungarapu vamsi
> Date: Tuesday, September 22, 2015 at 7:59 AM
> To: user
> Subject: Invalid checkpoint url
>
I am using reduceByKeyAndWindow (with an inverse reduce function) in my code.
To use this, it seems the checkpoint directory I have to use
should be on a Hadoop-compatible file system.
Does that mean I should set up Hadoop on my system?
I googled this and found in a Stack Overflow answer
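For what it's worth, the checkpoint directory only needs to be reachable through the Hadoop FileSystem API, and that API also covers the local file system. A minimal sketch, assuming a StreamingContext named `ssc` already exists:

```scala
// Running locally: a file:// URI works because the local file system
// is exposed through the Hadoop FileSystem API (no Hadoop install needed).
ssc.checkpoint("file:///tmp/spark-checkpoint")

// On a real cluster every executor must be able to reach the directory,
// so a shared store such as HDFS or S3 is the usual choice:
// ssc.checkpoint("hdfs://namenode:8020/user/spark/checkpoint")
```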
data to be collected on the
> driver (assuming you don’t want that…)
>
> val events = kafkaDStream.map { case (devId, byteArray) =>
>   KafkaGenericEvent.parseFrom(byteArray) }
>
> From: srungarapu vamsi
> Date: Thursday, September 17, 2015 at 4:03 PM
> To: user
> Subject: Spa
I am using KafkaUtils.createDirectStream to read the data from the Kafka bus.
On the producer end, I am generating messages in the following way:
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers)
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
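As a plain-JVM sketch of that producer configuration (using the literal string keys that the ProducerConfig constants resolve to, so no Kafka dependency is needed; the serializer class name below is a made-up placeholder):

```scala
import java.util.Properties

object ProducerPropsSketch {
  def build(brokers: String): Properties = {
    val props = new Properties()
    // ProducerConfig.BOOTSTRAP_SERVERS_CONFIG resolves to "bootstrap.servers"
    props.put("bootstrap.servers", brokers)
    // ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG resolves to
    // "value.serializer"; the class named here is a hypothetical
    // custom serializer for KafkaGenericEvent.
    props.put("value.serializer", "com.example.KafkaGenericEventSerializer")
    props
  }
}
```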
details of how you do deserialization.
>
> Thanks
> Saisai
>
> On Thu, Sep 17, 2015 at 9:49 AM, srungarapu vamsi <
> srungarapu1...@gmail.com> wrote:
>
>> If I understand correctly, you are suggesting that I do this:
>>
>> val kafkaDS
> If you expect invalid messages, you can use flatMap instead and wrap
> .parseFrom in a Try {} .toOption.
>
> Sent from my iPhone
>
> On 17 Sep 2015, at 18:23, srungarapu vamsi <srungarapu1...@gmail.com>
> wrote:
>
> @Adrian,
> I am doing collect for debugging purp
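The flatMap-plus-Try pattern suggested above can be sketched in plain Scala (`parseFrom` here is a stand-in for KafkaGenericEvent.parseFrom, so nothing below needs Kafka or Spark):

```scala
import scala.util.Try

object SafeParse {
  // Stand-in parser: fails on payloads that are not integers.
  def parseFrom(bytes: Array[Byte]): Int =
    new String(bytes, "UTF-8").trim.toInt

  // flatMap over Try(...).toOption silently drops messages that fail
  // to deserialize instead of killing the batch with an exception.
  def parseAll(payloads: Seq[Array[Byte]]): Seq[Int] =
    payloads.flatMap(b => Try(parseFrom(b)).toOption)
}
```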
>> ...explaining how Sparkta was born and what it does:
>> http://www.slideshare.net/Stratio/strata-sparkta
>>
>>
>> Feel free to ask us anything about the project.
>>
>> 2015-09-15 8:10 GMT+02:00 srungarapu vamsi <
On 15 Sep 2015, at 6:20, srungarapu vamsi <srungarapu1...@gmail.com>
> wrote:
>
>> I am pretty new to Spark. Please suggest a better model for the following
>> use case.
>>
>> I have a few (about 1500) devices in the field which keep emitting about 100KB
>> of da
I am pretty new to Spark. Please suggest a better model for the following
use case.
I have a few (about 1500) devices in the field, each emitting about 100KB of
data every minute. The data sent by the devices is just a list of
numbers.
As of now, we have Storm in the architecture
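For rough sizing, the aggregate ingest rate implied by those numbers is easy to work out (a back-of-the-envelope sketch; 100KB is taken as 100 * 1024 bytes):

```scala
object IngestRate {
  val devices = 1500
  val bytesPerDevicePerMinute: Long = 100L * 1024              // ~100KB
  val bytesPerMinute: Long = devices * bytesPerDevicePerMinute // all devices
  val mbPerSecond: Double = bytesPerMinute / 60.0 / (1024 * 1024)
}
```

That comes to roughly 150MB per minute, or about 2.4MB/s in aggregate.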
Hi,
I am using a Mesos cluster to run my Spark jobs.
I have one mesos-master and two mesos-slaves set up on 2 machines.
On one machine, both the master and a slave are set up; on the second machine
only a mesos-slave is set up.
I run these on m3.large EC2 instances.
1. When I try to submit two jobs using