Hi All,
I'm a newbie in Spark MLlib. In my office I have a statistician who works on
improving our matrix model for our recommendation engine. However, he works
in R. He told me that it's quite possible to combine collaborative
filtering and latent Dirichlet allocation (LDA) by doing some
I think querying via the Cassandra Query Language (CQL) will be better in
terms of performance if you want to pull and filter the data from your DB,
rather than pulling all of the data and doing the filtering and
transformation with a Spark DataFrame.
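For reference, a minimal sketch of that pushdown idea, assuming the DataStax spark-cassandra-connector is on the classpath; the keyspace/table/column names here are made up for illustration:

```scala
// Sketch only: assumes the spark-cassandra-connector dependency and a
// Cassandra node at 127.0.0.1; "shop"/"orders" are hypothetical names.
import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("cql-pushdown-sketch")
  .set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new SparkContext(conf)

// select() and where() are pushed down to Cassandra as CQL, so only the
// matching columns/rows cross the network -- no full-table pull into Spark.
val rows = sc.cassandraTable("shop", "orders")
  .select("order_id", "amount")
  .where("customer_id = ?", "42")
```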
On 31 Mar 2016 22:19, "asethia"
I don't know how to read the data from the checkpoint. But AFAIK, and based
on my experience, the best thing you can do is store the offset in a
particular storage such as a database every time you consume a
message. Then read the offset from the database every time you want to start
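That bookkeeping can be sketched roughly like this, assuming the Spark 1.x Kafka direct API; ssc and kafkaParams are assumed defined, and loadOffsets/saveOffsets are hypothetical helpers around whatever database you pick:

```scala
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils, OffsetRange}

// Read the last committed offsets back from the DB (hypothetical helper).
val fromOffsets: Map[TopicAndPartition, Long] = loadOffsets()

val stream = KafkaUtils.createDirectStream[
    String, String, StringDecoder, StringDecoder, (String, String)](
  ssc, kafkaParams, fromOffsets,
  (mmd: MessageAndMetadata[String, String]) => (mmd.key, mmd.message()))

stream.foreachRDD { rdd =>
  val ranges: Array[OffsetRange] = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // Process the batch first, then persist each range's untilOffset,
  // so a restart resumes from the last fully processed batch.
  saveOffsets(ranges) // hypothetical helper
}
```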
Hi,
I have a Spark application for batch processing in a standalone cluster. The
job is to query the database and then do some transformation, aggregation,
and several actions, such as indexing the result into Elasticsearch.
If I don't call sc.stop(), the Spark application won't stop and take
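One common shape for this, as a sketch: wrap the job in try/finally so the context is always stopped and the driver can exit:

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("batch-job"))
try {
  // query the DB, transform, aggregate, index into Elasticsearch ...
} finally {
  // Without this the driver JVM keeps running and holds cluster resources.
  sc.stop()
}
```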
Hi,
I'm just trying to process the data that comes from the Kafka source in my
Spark Streaming application. What I want to do is get the pair of topic and
message as a tuple from the message stream.
Here is my streams:
val streams = KafkaUtils.createDirectStream[String, Array[Byte],
>
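One way to get (topic, message) pairs (a sketch, assuming the Spark 1.x Kafka direct API; ssc, kafkaParams and fromOffsets are assumed defined elsewhere) is the messageHandler overload, which exposes the topic per record:

```scala
import kafka.message.MessageAndMetadata
import kafka.serializer.{DefaultDecoder, StringDecoder}
import org.apache.spark.streaming.kafka.KafkaUtils

// The fifth type parameter is the record type the handler returns;
// here each record becomes a (topic, payload) tuple.
val streams = KafkaUtils.createDirectStream[
    String, Array[Byte], StringDecoder, DefaultDecoder, (String, Array[Byte])](
  ssc, kafkaParams, fromOffsets,
  (mmd: MessageAndMetadata[String, Array[Byte]]) => (mmd.topic, mmd.message()))
```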
keep it this way:
> >>
> >> val stream1 = KafkaUtils.createStream(..) // for topic 1
> >>
> >> val stream2 = KafkaUtils.createStream(..) // for topic 2
> >>
> >>
> >> And you will know which stream belongs to which topic.
> >>
>
creating. Like, create a
> tuple(topic, stream) and you will be able to access ._1 as topic and ._2 as
> the stream.
>
>
> Thanks
> Best Regards
>
> On Tue, Mar 15, 2016 at 12:05 PM, Imre Nagi <imre.nagi2...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I'm j
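The tuple(topic, stream) suggestion above could look roughly like this (a sketch; ssc, zkQuorum and group are assumed defined, and the topic names are made up):

```scala
import org.apache.spark.streaming.kafka.KafkaUtils

val topics = Seq("topic1", "topic2") // hypothetical topic names
val topicStreams = topics.map { t =>
  (t, KafkaUtils.createStream(ssc, zkQuorum, group, Map(t -> 1)))
}
// ._1 is the topic name, ._2 its DStream
topicStreams.foreach { case (topic, stream) =>
  stream.map(_._2).print() // route each topic to its own processing here
}
```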
Hi,
I'm just trying to create a Spark Streaming application that consumes more
than one topic sent by Kafka. Then, I want to do different further
processing for the data sent by each topic.
val kafkaStreams = {
> val kafkaParameter = for (consumerGroup <- consumerGroups) yield {
>
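A sketch of how the for/yield above might continue, assuming consumerGroups pairs each consumer group with its topic (a hypothetical structure), with different downstream logic per topic:

```scala
import org.apache.spark.streaming.kafka.KafkaUtils

// consumerGroups: Seq[(String, String)] of (groupId, topic) -- an assumption
val kafkaStreams = for ((consumerGroup, topic) <- consumerGroups) yield {
  (topic, KafkaUtils.createStream(ssc, zkQuorum, consumerGroup, Map(topic -> 1)))
}
// Dispatch on the topic name to process each stream differently
// ("orders" is a made-up example):
kafkaStreams.foreach {
  case ("orders", stream) => stream.map(_._2).print()
  case (_, stream)        => stream.count().print()
}
```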
Do you mean listening to the Twitter stream data? Maybe you can use the
Twitter Streaming API or Twitter Search API for this purpose.
Imre
On Tue, Mar 8, 2016 at 2:54 PM, Soni spark wrote:
> Hallo friends,
>
> I need urgent help.
>
> I am using spark streaming to get
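For the Twitter case, a minimal sketch assuming the spark-streaming-twitter artifact is on the classpath and twitter4j OAuth credentials are set as system properties, with ssc defined elsewhere:

```scala
import org.apache.spark.streaming.twitter.TwitterUtils

// None means: build the auth from the twitter4j.oauth.* system properties.
val tweets = TwitterUtils.createStream(ssc, None)
tweets.map(status => status.getText).print()
```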