Avro sink --> Spark Streaming
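For the push-based setup, something like this (an untested sketch against
the spark-streaming-flume API; "receiver-host" and the port are placeholders
and must match the hostname/port configured on the Flume avro sink):

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

// Push-based: the Flume agent's avro sink points at the host/port
// where this receiver runs; Spark acts as the Avro server.
val conf = new SparkConf().setAppName("FlumePush")
val ssc = new StreamingContext(conf, Seconds(10))

// "receiver-host" and 9988 are placeholders; they must match the
// avro sink config on the Flume side.
val flumeStream = FlumeUtils.createStream(ssc, "receiver-host", 9988,
  StorageLevel.MEMORY_AND_DISK_SER_2)

flumeStream.map(e => new String(e.event.getBody.array()))
  .count()
  .print()

ssc.start()
ssc.awaitTermination()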

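On questions 1-3 quoted below: each batch interval produces a single RDD
per stream (its partitions come from spark.streaming.blockInterval, 200 ms
by default), so for a 2 GB batch you normally repartition before the heavy
work. On the ACK: with the pull-based approach, Flume runs Spark's
SparkSink and only commits its transaction after Spark Streaming has
received and replicated the events, so if either side dies the batch is
rolled back and redelivered. A sketch (again untested; the agent, host and
port names are placeholders); the Kafka channel question is covered in the
sketch after the quoted thread:

// Flume-side sink config (pull-based; names are placeholders):
//   agent.sinks.spark.type = org.apache.spark.streaming.flume.sink.SparkSink
//   agent.sinks.spark.hostname = flume-host
//   agent.sinks.spark.port = 9989
//   agent.sinks.spark.channel = fileChannel

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

val conf = new SparkConf().setAppName("FlumePull")
val ssc = new StreamingContext(conf, Seconds(10))

// Must match the SparkSink hostname/port above.
val events = FlumeUtils.createPollingStream(ssc, "flume-host", 9989)

// One RDD per batch; spread a large batch across the cluster
// before doing the expensive part.
events.repartition(8)
  .map(e => new String(e.event.getBody.array()))
  .count()
  .print()

ssc.start()
ssc.awaitTermination()
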
2017-01-16 13:55 GMT+01:00 ayan guha <guha.a...@gmail.com>:

> With Flume, what would be your sink?
>
>
>
> On Mon, Jan 16, 2017 at 10:44 PM, Guillermo Ortiz <konstt2...@gmail.com>
> wrote:
>
>> I'm considering using Flume (file channel) --> Spark Streaming.
>>
>> I have some doubts about it:
>>
>> 1. The RDD size is all the data that arrives during the micro-batch
>> interval you have defined, right?
>>
>> 2. If 2 GB of data arrive, how many RDDs are generated? Just one, which
>> I then have to repartition?
>>
>> 3. When is the ACK sent back from Spark to Flume?
>>   I guess that if Flume dies, Flume is going to send the same data to
>> Spark again.
>>   If Spark dies, I have no idea whether it will reprocess the same data
>> when it is sent again.
>>   Could it be different if I use a Kafka channel?
>
>
> --
> Best Regards,
> Ayan Guha
>
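On the Kafka channel question: if the events sit in a Kafka topic (Flume's
Kafka channel), Spark can skip the Flume sink entirely and use the direct
Kafka stream; Spark then tracks the offsets itself and, after a failure,
re-reads exactly the offsets it has not finished processing instead of
depending on Flume redelivery. A rough sketch against the 0.8 direct API
(broker list and topic name are placeholders):

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("KafkaChannelDirect")
val ssc = new StreamingContext(conf, Seconds(10))

// Placeholders: point these at the brokers/topic behind the Kafka channel.
val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
val topics = Set("flume-channel-topic")

val lines = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, topics)

lines.map(_._2).count().print()

ssc.start()
ssc.awaitTermination()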
