Hi again,

On Tue, Sep 9, 2014 at 2:20 PM, Tobias Pfeiffer <t...@preferred.jp> wrote:
>
> On Tue, Sep 9, 2014 at 2:02 PM, Ron's Yahoo! <zlgonza...@yahoo.com> wrote:
>>
>> For example, let’s say there’s a particular topic T1 in a Kafka queue.
>> If I have a new set of requests coming from a particular client A, I was
>> wondering if I could create a partition A.
>> The streaming job is submitted to listen to T1.A and will write to a
>> topic T2.A, which the REST endpoint would be listening on.
>>
>
> That doesn't seem like a good way to use Kafka. It may be possible, but I
> am pretty sure you should create a new topic T_A instead of a partition A
> in an existing topic. With some modifications of Spark Streaming's
> KafkaReceiver you *might* be able to get it to work as you imagine, but it
> was not meant to be that way, I think.
>
Maybe I was wrong about a new topic being the better way. Looking, for example, at the way that Samza consumes Kafka streams
<http://samza.incubator.apache.org/learn/documentation/latest/introduction/concepts.html>,
it seems like there is one task per partition and data can go into partitions keyed by user ID. So maybe a new partition is actually the conceptually better way. Nonetheless, the built-in KafkaReceiver doesn't support assignment of partitions to receivers AFAIK ;-)

Tobias
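For anyone following along, the "one task per partition, keyed by user ID" idea can be sketched like this. This is a toy illustration only, not Kafka's actual partitioner (Kafka's producer uses its own hash of the message key); the function name and hash are made up for the example:

```python
# Toy sketch of key-based partition assignment, in the spirit of
# Samza's "one task per partition" model. NOT Kafka's real
# partitioner; it only illustrates that all messages with the same
# key (e.g. a client/user ID) land in the same partition, so a
# single per-partition consumer task sees that key's messages.

NUM_PARTITIONS = 4

def partition_for(user_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Deterministically map a user ID to a partition number."""
    # Simple stand-in hash; the real producer would use its own
    # hash function over the message key.
    h = sum(ord(c) for c in user_id)
    return h % num_partitions

# The mapping is stable: the same key always yields the same
# partition, and the result is always a valid partition index.
assert partition_for("client-A") == partition_for("client-A")
assert 0 <= partition_for("client-A") < NUM_PARTITIONS
```

The point of the thread still stands, though: even if per-client partitions are conceptually clean, Spark Streaming's built-in KafkaReceiver (as of this writing) gives you no way to pin specific partitions to specific receivers.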