Hi all!
I would love to use Spark with a somewhat more modern logging framework
than Log4j 1.2. I have Logback in mind, mostly because it integrates well
with central logging solutions such as the ELK stack. I've read up a bit on
getting Spark 2.0 (that's what I'm using currently) to work with Logback.
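For what it's worth, one common approach is to exclude Spark's Log4j 1.2 / SLF4J-Log4j bindings from the build and pull in Logback instead. A sketch in sbt — the version numbers are illustrative, not a recommendation, and whether the exclusions work cleanly depends on your deployment mode:

```scala
// build.sbt sketch (versions illustrative): keep Log4j 1.2 bindings off the
// classpath so SLF4J binds to Logback instead.
libraryDependencies ++= Seq(
  ("org.apache.spark" %% "spark-core" % "2.0.0")
    .exclude("org.slf4j", "slf4j-log4j12")   // drop the Log4j 1.2 SLF4J binding
    .exclude("log4j", "log4j"),              // drop Log4j 1.2 itself
  "ch.qos.logback" % "logback-classic" % "1.1.7",
  // Route any remaining direct Log4j 1.2 calls through SLF4J:
  "org.slf4j" % "log4j-over-slf4j" % "1.7.21"
)
```

Note this only covers the driver's own classpath; executors launched by a cluster manager may still ship Spark's bundled Log4j unless the cluster classpath is adjusted too.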
That's nice and all, but I'd rather have a solution involving mapWithState
of course :) I'm just wondering why it doesn't support this use case yet.
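For readers landing on this thread: the newer API being discussed is mapWithState, which only touches keys present in the current batch rather than scanning all accumulated state each interval. A minimal sketch, assuming a local Spark 2.0 Streaming setup (the socket source and checkpoint path are placeholders):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, State, StateSpec, StreamingContext}

// Sketch: incremental word counts with mapWithState. Unlike
// updateStateByKey, the mapping function runs only for keys that
// appear in the current batch.
object MapWithStateSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("sketch")
    val ssc  = new StreamingContext(conf, Seconds(1))
    ssc.checkpoint("/tmp/checkpoint") // stateful ops require checkpointing

    val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))

    val spec = StateSpec.function {
      (word: String, one: Option[Int], state: State[Int]) =>
        val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
        state.update(sum)
        (word, sum) // emitted once per update, not once per key per batch
    }

    words.map((_, 1)).mapWithState(spec).print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```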
On Tue, Oct 11, 2016 at 3:41 PM, Cody Koeninger wrote:
> They're telling you not to use the old function because it's linear
> [...] the RDD knows how to partition things (this is the
> getPartitions method)
>
> You can look at those methods in DirectKafkaInputDStream and KafkaRDD
> respectively if you want to see an example
>
> On Tue, Sep 13, 2016 at 9:37 AM, Daan Debie <debie.d...@gmail.com> wrote:
> > Ah, that makes it much clearer, thanks!
> [...] a time slot.
> It's one RDD within a time slot, with multiple partitions.
>
> On Tue, Sep 13, 2016 at 9:01 AM, Daan Debie <debie.d...@gmail.com> wrote:
> > Thanks, but that thread does not answer my questions, which are about the
> > distributed nature of RDDs vs the small nature of "micro batches" and on
> > how Spark Streaming distributes work.
Thanks, but that thread does not answer my questions, which are about the
distributed nature of RDDs vs the small nature of "micro batches" and on
how Spark Streaming distributes work.
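The "one RDD per time slot, with multiple partitions" point above can be made concrete with a small local experiment. A sketch, assuming Spark 2.0 in local mode and using queueStream purely as a toy source:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

// Sketch: each batch interval yields exactly one RDD; that RDD's
// partitions are the unit of distribution across executor cores.
object OneRddPerBatch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[4]").setAppName("batches")
    val ssc  = new StreamingContext(conf, Seconds(1))

    // Toy source: a queue holding one pre-built RDD with 8 partitions.
    val queue  = mutable.Queue(ssc.sparkContext.parallelize(1 to 1000, numSlices = 8))
    val stream = ssc.queueStream(queue)

    stream.foreachRDD { rdd =>
      // One RDD per batch interval; its partitions run in parallel.
      println(s"batch -> 1 RDD, ${rdd.getNumPartitions} partitions")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

So a "micro batch" is not itself small in the distributed sense: it is a single, normal RDD whose partitions are spread across the cluster like any other RDD's.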
On Tue, Sep 13, 2016 at 3:34 PM, Mich Talebzadeh
wrote:
> Hi Daan,
>
> You may find