Re: Need guidelines in Spark Streaming and Kafka integration

2016-11-16 Thread Karim, Md. Rezaul
Hi Tariq and Jon,

First of all, thanks for the quick response. I really appreciate it.

Well, I would like to start from the very beginning of using Kafka with
Spark. For example, in the Spark distribution, I found an example using
Kafka with Spark Streaming that demonstrates a Direct Kafka Word Count.
In that example, I found the main class
*JavaDirectKafkaWordCount.java* (under the
spark-2.0.0-bin-hadoop2.7\examples\src\main\java\org\apache\spark\examples\streaming
directory) that contains a code segment as follows:


---*-
String brokers = args[0];
String topics = args[1];

// Create context with a 20-second batch interval
SparkConf sparkConf = new SparkConf()
    .setAppName("JavaDirectKafkaWordCount")
    .setMaster("local[*]");
JavaStreamingContext jssc =
    new JavaStreamingContext(sparkConf, Durations.seconds(20));

Set<String> topicsSet = new HashSet<>(Arrays.asList(topics.split(",")));
Map<String, String> kafkaParams = new HashMap<>();
kafkaParams.put("metadata.broker.list", brokers);
---*-

In this code block, the confusing part is setting the values of the two
command-line arguments (i.e., *brokers* and *topics*). I tried to set them
as follows:

String brokers = "localhost:8890,localhost:8892";
String topics = " topic1,topic2";

However, I know this is not the right way to do it. There has to be a
correct way of setting the values of the brokers and topics.

So what I need help with is how to set/configure these two parameters so
that I can run this hello-world-like example successfully.
Any kind of help would be highly appreciated.
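
To make the question concrete, here is a minimal sketch of how the two values might be prepared before they are handed to the example. It rests on assumptions that are not confirmed anywhere in this thread: a Kafka broker listening on localhost:9092 (Kafka's default port) and topics named topic1 and topic2. The class name KafkaArgsSketch and its helper methods are hypothetical and are not part of Spark or Kafka. The sketch only covers splitting the comma-separated strings and trimming stray whitespace; it does not start a stream.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
public class KafkaArgsSketch {

    // Split a comma-separated list, trimming whitespace and dropping empty
    // entries, so that " topic1,topic2" yields "topic1" and "topic2".
    static Set<String> splitCsv(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    // Build the parameter map used in the code segment above.
    // "metadata.broker.list" expects host:port pairs of Kafka brokers.
    static Map<String, String> buildKafkaParams(String brokers) {
        Map<String, String> kafkaParams = new HashMap<>();
        kafkaParams.put("metadata.broker.list",
                String.join(",", splitCsv(brokers)));
        return kafkaParams;
    }

    public static void main(String[] args) {
        // Assumed defaults: one local broker on Kafka's default port and
        // two topic names taken from the attempt above.
        String brokers = args.length > 0 ? args[0] : "localhost:9092";
        String topics = args.length > 1 ? args[1] : " topic1,topic2";

        Set<String> topicsSet = splitCsv(topics);
        Map<String, String> kafkaParams = buildKafkaParams(brokers);

        System.out.println("topics = " + topicsSet);
        System.out.println("kafkaParams = " + kafkaParams);
    }
}
```

Run without arguments, this prints the trimmed topic set and a map with a single metadata.broker.list entry. Whether those values work against a real cluster depends on the brokers actually listening on the given host and port and on the topics being available there.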




Regards,
_
*Md. Rezaul Karim* BSc, MSc
PhD Researcher, INSIGHT Centre for Data Analytics
National University of Ireland, Galway
IDA Business Park, Dangan, Galway, Ireland
Web: http://www.reza-analytics.eu/index.html

On 17 November 2016 at 03:08, Jon Gregg <jonrgr...@gmail.com> wrote:

> Since you're completely new to Kafka, I would start with the Kafka docs (
> https://kafka.apache.org/documentation).  You should be able to get
> through the Getting Started part easily and there are some examples for
> setting up a basic Kafka server.
>
> You don't need Kafka to start working with Spark Streaming (there are
> examples online to pull directly from Twitter, for example).  But at a high
> level if you're sending data from one server to another, it can be
> beneficial to send the messages to a distributed queue first for durable
> storage (so data doesn't get lost in transmission) and other benefits.
>
> On Wed, Nov 16, 2016 at 2:12 PM, Mohammad Tariq <donta...@gmail.com>
> wrote:
>
>> Hi Karim,
>>
>> Are you looking for something specific? Some information about your
>> use case would be really helpful in order to answer your question.
>>
>>
>> On Wednesday, November 16, 2016, Karim, Md. Rezaul <
>> rezaul.ka...@insight-centre.org> wrote:
>>
>>> Hi All,
>>>
>>> I am completely new to Kafka. I was wondering if somebody could
>>> provide me with some guidelines on how to develop real-time streaming
>>> applications using the Spark Streaming API with Kafka.
>>>
>>> I am aware of the Spark Streaming and Kafka integration guide [1].
>>> However, wouldn't a real-life example be a better place to start?
>>>
>>>
>>>
>>> 1. http://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html
>>>
>>>
>>>
>>>
>>>
>>> Regards,
>>> _
>>> *Md. Rezaul Karim* BSc, MSc
>>> PhD Researcher, INSIGHT Centre for Data Analytics
>>> National University of Ireland, Galway
>>> IDA Business Park, Dangan, Galway, Ireland
>>> Web: http://www.reza-analytics.eu/index.html
>>>
>>
>>
>> --
>>
>> Tariq, Mohammad
>> about.me/mti
>>
>>
>>
>


Need guidelines in Spark Streaming and Kafka integration

2016-11-16 Thread Karim, Md. Rezaul
Hi All,

I am completely new to Kafka. I was wondering if somebody could provide
me with some guidelines on how to develop real-time streaming applications
using the Spark Streaming API with Kafka.

I am aware of the Spark Streaming and Kafka integration guide [1]. However,
wouldn't a real-life example be a better place to start?



1. http://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html





Regards,
_
*Md. Rezaul Karim* BSc, MSc
PhD Researcher, INSIGHT Centre for Data Analytics
National University of Ireland, Galway
IDA Business Park, Dangan, Galway, Ireland
Web: http://www.reza-analytics.eu/index.html



Re: How to include book title at "Books" section on Spark website

2016-10-09 Thread Karim, Md. Rezaul
Hi Owen,

Thanks so much for the quick response. The book is already available online
as an Alpha. It would be great and appreciated if you could add the title
to the Spark website.

Here's the related information about the book:

*Title:* Large Scale Machine Learning with Spark
*Author:* Md. Rezaul Karim and Md. Mahedi Kaysar
*Publisher:* Packt Publishing (UK)
*URL:* https://www.packtpub.com/big-data-and-business-intelligence/large-scale-machine-learning-spark




Regards,
_
*Md. Rezaul Karim*, BSc, MSc
PhD Researcher, INSIGHT Centre for Data Analytics,
National University of Ireland, Galway
IDA Business Park, Dangan, Galway, Ireland



On Oct 9, 2016 9:22 AM, "Sean Owen" <so...@cloudera.com> wrote:

> I can add it, just send me the info once it's available.
>
> On Sat, Oct 8, 2016 at 7:45 PM Karim, Md. Rezaul <
> rezaul.ka...@insight-centre.org> wrote:
>
>> Hi,
>>
>> I am writing a book on machine learning using Spark, which is going to be
>> published soon.
>>
>> Could anyone tell me how to add the title to the "Books" section of
>> this page: http://spark.apache.org/documentation.html ?
>>
>>
>> Regards,
>> _
>> *Md. Rezaul Karim* BSc, MSc
>> PhD Researcher, INSIGHT Centre for Data Analytics
>> National University of Ireland, Galway
>> IDA Business Park, Dangan, Galway, Ireland
>>
>>
>>