Avro sink --> Spark Streaming
2017-01-16 13:55 GMT+01:00 ayan guha <guha.a...@gmail.com>:
> With Flume, what would be your sink?
>
> On Mon, Jan 16, 2017 at 10:44 PM, Guillermo Ortiz <konstt2...@gmail.com>
> wrote:
>
>> I'm wondering to use Flume (c
With Flume, what would be your sink?
On Mon, Jan 16, 2017 at 10:44 PM, Guillermo Ortiz <konstt2...@gmail.com>
wrote:
> I'm considering using Flume (file channel) with Spark Streaming.
>
> I have some questions about it:
>
> 1. The RDD contains all the data that arrives in a micr
I'm considering using Flume (file channel) with Spark Streaming.
I have some questions about it:
1. The RDD contains all the data that arrives within the microbatch
interval you have defined. Right?
2. If there are 2 GB of data, how many RDDs are generated? Just one, which
I then have to repartition?
3. When is the ACK
> flumeStream.print
> ssc.start
>
> And I'm getting this exception.
>
> 16/03/20 18:17:17 INFO scheduler.ReceiverTracker: Registered receiver for stream 0 from impact3.indigo.co.il:51581
> 16/03/20 18:17:17 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 76,
$ jar tvf ./external/flume-sink/target/spark-streaming-flume-sink_2.10-1.6.1.jar | grep SparkFlumeProtocol
   841 Thu Mar 03 11:09:36 PST 2016 org/apache/spark/streaming/flume/sink/SparkFlumeProtocol$Callback.class
  2363 Thu Mar 03 11:09:36 PST 2016 org/apache/spark/streaming/flume/sink
exception.
16/03/20 18:17:17 INFO scheduler.ReceiverTracker: Registered receiver for stream 0 from impact3.indigo.co.il:51581
16/03/20 18:17:17 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 76, impact3.indigo.co.il): java.lang.NoClassDefFoundError: org/apache/spark/streaming/