invoking kafka consumer as soon as message arrives in topic

2015-10-26 Thread Kudumula, Surender
Hi all
I am trying to write a web application that is invoked when a message arrives 
in a topic. The code waits for a message in the Kafka consumer's while loop; 
sometimes it picks up the message and sometimes it waits forever, even when a 
message has been produced to the topic. I am invoking the Kafka consumer in the 
servlet's init method, so it blocks in the while loop and the web app never 
finishes deploying. I would appreciate any suggestions, as I am looking to 
invoke my consumer as soon as any message arrives in the topic.
Map<String, Integer> topicMap = new HashMap<String, Integer>();
// 1 represents the single thread
topicMap.put(topic, new Integer(1));

Map<String, List<KafkaStream<byte[], byte[]>>> consumerStreamsMap =
        consumerConnector.createMessageStreams(topicMap);

// Get the message stream for the topic, using the default decoder.
KafkaStream<byte[], byte[]> stream = consumerStreamsMap.get(topic).get(0);
ConsumerIterator<byte[], byte[]> it = stream.iterator();

try {
    // reads from Kafka until you stop it
    while (it.hasNext()) {
        byte[] msg = it.next().message();
        final RequestMessage msgAsObject = convertFromBytes(msg);
        if (msgAsObject != null) {
            Thread thread = new Thread() {
                public void run() {
                    logger.info("Starting new Thread for request Object" + msgAsObject.getRequestId());
                    StateMachine sm = new StateMachine(msgAsObject);
                    sm.processPdfaRequest();
                    System.out.println("Request Fully completed");
                    logger.info("Request Fully completed");
                    Thread.currentThread().stop();
                    sm = null;
                }
            };
            thread.run();
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
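
For reference, one way to keep init() from blocking deployment is to run the
poll loop on its own background thread, and to hand each request off with
thread.start() rather than thread.run() (run() executes on the calling thread,
so nothing actually runs concurrently). A rough sketch only, reusing the topic,
consumerConnector, convertFromBytes, RequestMessage and StateMachine names from
the code above; everything else here is an assumption, not a verified fix:

    public void init() throws ServletException {
        // Start the blocking poll loop on a daemon thread so init() returns
        // immediately and the web app finishes deploying.
        Thread consumerThread = new Thread(new Runnable() {
            public void run() {
                Map<String, Integer> topicMap = new HashMap<String, Integer>();
                topicMap.put(topic, 1);
                KafkaStream<byte[], byte[]> stream =
                        consumerConnector.createMessageStreams(topicMap).get(topic).get(0);
                ConsumerIterator<byte[], byte[]> it = stream.iterator();
                while (it.hasNext()) {              // blocks until a message arrives
                    final RequestMessage msgAsObject = convertFromBytes(it.next().message());
                    if (msgAsObject != null) {
                        new Thread(new Runnable() {
                            public void run() {
                                new StateMachine(msgAsObject).processPdfaRequest();
                            }
                        }).start();                 // start(), not run()
                    }
                }
            }
        });
        consumerThread.setDaemon(true);
        consumerThread.start();                     // init() returns immediately
    }

A ServletContextListener's contextInitialized() would work the same way if the
consumer should not be tied to a single servlet.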

Regards

Surender Kudumula
Big Data Consultant - EMEA
Analytics & Data Management

surender.kudum...@hpe.com
M +44 7795970923

Hewlett-Packard Enterprise
Cain Rd,
Bracknell
RG12 1HN
UK




Re: Queue Full

2015-10-26 Thread 马哲超
You can try to set "queue.buffering.max.messages" larger, for example:

confParam=["queue.buffering.max.messages=200",
"batch.num.messages=1000"]

And a reference for you:
https://github.com/edenhill/librdkafka/issues/210

2015-10-27 1:19 GMT+08:00 Prabhjot Bharaj :

> Hi,
>
> This is a type of problem where you operating more than network capacity.
> This can be handled at two places (you decide whichever is useful/practical
> for use case)  :-
>
> 1. In case the bottleneck is because of the broker slowness, increase the
> number of partitions of your topic, balance them out to newer nodes (if
> required) and retry. Make sure your producer can write to the new
> partitions as well
> 2. In case this is because your producer is creating too many messages,
> there are 2 ways you can commit them (select the most appropriate according
> to your use case. I'm considering that you dont want to lose your
> messages):-
> --> a. increase the number of producer machines (i.e. balance your
> producer load among more machines)
> --->b. produce at a slower rate (in case your producers are not
> scalable, you are in a problem. you can solve this problem by sacrificing
> latency). Write to your queue in batches and wait for it to get cleared
> before you commit the next batch to your local queue.
>
> Regards,
> Prabcs
>
>
> On Mon, Oct 26, 2015 at 10:41 PM, Eduardo Costa Alfaia <
> e.costaalf...@unibs.it> wrote:
>
> > Hi Magnus
> > I think this answer
> > c) producing messages at a higher rate than the network or broker can
> > handle
> > How could I manager this?
> >
> >
> > > On 26 Oct 2015, at 17:45, Magnus Edenhill  wrote:
> > >
> > > c) producing messages at a higher rate than the network or broker can
> > > handle
> >
> >
> > --
> > Informativa sulla Privacy: http://www.unibs.it/node/8155
> >
>
>
>
> --
> -
> "There are only 10 types of people in the world: Those who understand
> binary, and those who don't"
>


Regarding the Kafka offset management issue in Direct Stream Approach.

2015-10-26 Thread Charan Ganga Phani Adabala
Hi All,

We are working with Apache Spark and Kafka integration; in this use case we are 
using the DirectStream approach. To avoid data loss we read the offsets and 
save them into MongoDB.
We would like some clarification on whether Spark stores any offsets 
internally. An example:
For the first RDD batch we get offsets 0 to 5 to process, but the application 
crashes unexpectedly. When we start the application again, does the new job 
fetch events 0 to 5 again, or does it resume from where the previous job 
stopped?
We are not committing any offsets in the above process, because offsets have to 
be committed manually in the DirectStream approach. Does the new job fetch 
events from the 0th position?


Thanks & Regards,
Ganga Phani Charan Adabala | Software Engineer
o:  +91-40-23116680 | c:  +91-9491418099
e:  char...@eiqnetworks.com
EiQ Networks®, Inc. |  www.eiqnetworks.com
www.socvue.com | www.eiqfederal.com

"This email is intended only for the use of the individual or entity named 
above and may contain information that is confidential and privileged. If you 
are not the intended recipient, you are hereby notified that any dissemination, 
distribution or copying of the email is strictly prohibited. If you have 
received this email in error, please destroy the original message."




Kafka Best Practices in AWS

2015-10-26 Thread Jennifer Fountain
We installed four nodes (r3.xlarge) with approx 100 topics.  Some topics
have a replication factor of 4, some have 3 and some have 2. Each node is
running at about 90% CPU usage and we are seeing performance issues with
message consumption.

Would anyone have any words of wisdom or recommendations for AWS?  Should
we add more nodes? What should the replication factor be for x nodes?

Thank you!

-- 


Jennifer Fountain
DevOPS


Re: Regarding the Kafka offset management issue in Direct Stream Approach.

2015-10-26 Thread Cody Koeninger
Questions about spark's kafka integration should probably be directed to
the spark user mailing list, not this one.  I don't monitor kafka mailing
lists as closely, for instance.

For the direct stream, Spark doesn't keep any state regarding offsets,
unless you enable checkpointing.  Have you read
https://github.com/koeninger/kafka-exactly-once/blob/master/blogpost.md
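
For illustration, the pattern described there looks roughly like this against
the spark-streaming-kafka 0.8 Java API. The saveOffsetToMongo helper and the
wireOffsetTracking wrapper are hypothetical names; on restart, the saved
offsets would be read back into the fromOffsets map passed to
createDirectStream.

    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.streaming.api.java.JavaPairInputDStream;
    import org.apache.spark.streaming.kafka.HasOffsetRanges;
    import org.apache.spark.streaming.kafka.OffsetRange;

    // directStream is the JavaPairInputDStream<String, String> returned by
    // KafkaUtils.createDirectStream(...)
    void wireOffsetTracking(JavaPairInputDStream<String, String> directStream) {
        directStream.foreachRDD(new Function<JavaPairRDD<String, String>, Void>() {
            public Void call(JavaPairRDD<String, String> rdd) {
                // Each batch's RDD carries the offset ranges it was built from.
                OffsetRange[] ranges = ((HasOffsetRanges) rdd.rdd()).offsetRanges();
                // ... process the batch here ...
                for (OffsetRange r : ranges) {
                    // Hypothetical helper: persist topic/partition/untilOffset to
                    // MongoDB only after the batch has been processed successfully.
                    saveOffsetToMongo(r.topic(), r.partition(), r.untilOffset());
                }
                return null;
            }
        });
    }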





On Mon, Oct 26, 2015 at 3:43 AM, Charan Ganga Phani Adabala <
char...@eiqnetworks.com> wrote:

> Hi All,
>
>
>
> We are working in Apache spark with Kafka integration, in this use case we
> are using DirectStream approach. we want to avoid the data loss in this
> approach for actually we take offsets and saving that offset into MongoDB.
>
> We want some clarification is Spark stores any offsets internally, let us
> explain some example :
>
> For the first rdd batch we get 0 to 5 offsets of events to be processed,
> but unexpectedly the application is crashed, then we started aging the
> application, then this job fetches again from 0 to 5 events or where the
> event stopped in previous job.
>
> We are not committing any offsets in the above process, because we have
> to commit offsets manually in DirectStream approach. Is that new job
> fetches events form 0th position.
>
>
>
>
>
> Thanks & Regards,
> Ganga Phani Charan Adabala | Software Engineer
> o:  +91-40-23116680 | c:  +91-9491418099
> e:  char...@eiqnetworks.com
> EiQ Networks®, Inc. |  www.eiqnetworks.com
> www.socvue.com | www.eiqfederal.com


Queue Full

2015-10-26 Thread Eduardo Costa Alfaia
Hi Guys,

How could I solve this problem?

% Failed to produce message: Local: Queue full
% Failed to produce message: Local: Queue full

Thanks

-- 
Informativa sulla Privacy: http://www.unibs.it/node/8155


Re: Queue Full

2015-10-26 Thread Eduardo Costa Alfaia
Hi Magnus
I think the answer is
c) producing messages at a higher rate than the network or broker can
handle
How could I manage this?


> On 26 Oct 2015, at 17:45, Magnus Edenhill  wrote:
> 
> c) producing messages at a higher rate than the network or broker can
> handle


-- 
Informativa sulla Privacy: http://www.unibs.it/node/8155


Re: Queue Full

2015-10-26 Thread Prabhjot Bharaj
Hi,

This is the type of problem where you are producing more than the network
capacity can handle. It can be addressed in two places (decide whichever is
useful/practical for your use case):

1. If the bottleneck is broker slowness, increase the number of partitions
of your topic, balance them out to newer nodes (if required) and retry.
Make sure your producer can write to the new partitions as well.
2. If your producer is creating too many messages, there are two ways you
can handle it (select the most appropriate for your use case; I'm assuming
you don't want to lose messages):
--> a. increase the number of producer machines (i.e. balance your
producer load among more machines)
--> b. produce at a slower rate (if your producers are not scalable, you
have a problem; you can solve it by sacrificing latency). Write to your
queue in batches and wait for it to get cleared before you commit the next
batch to your local queue (a rough sketch follows below).
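
The "Queue full" error above comes from librdkafka, but purely as an
illustration of option 2b, here is what a bounded-batch send loop might look
like with the Kafka Java producer; sendInBatches and the batch size of 1000
are made up for the sketch, not taken from the original post.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Future;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    // Option 2b as code: cap the number of unacknowledged messages so the local
    // queue cannot grow without bound, at the cost of added latency.
    void sendInBatches(KafkaProducer<byte[], byte[]> producer, String topic,
                       List<byte[]> payloads) throws Exception {
        int batchSize = 1000;                      // made-up value; tune for your latency budget
        List<Future<RecordMetadata>> inFlight = new ArrayList<Future<RecordMetadata>>();
        for (byte[] payload : payloads) {
            inFlight.add(producer.send(new ProducerRecord<byte[], byte[]>(topic, payload)));
            if (inFlight.size() >= batchSize) {
                for (Future<RecordMetadata> f : inFlight) {
                    f.get();                       // block until the broker has acknowledged the batch
                }
                inFlight.clear();
            }
        }
        for (Future<RecordMetadata> f : inFlight) { // drain the final partial batch
            f.get();
        }
    }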

Regards,
Prabcs


On Mon, Oct 26, 2015 at 10:41 PM, Eduardo Costa Alfaia <
e.costaalf...@unibs.it> wrote:

> Hi Magnus
> I think this answer
> c) producing messages at a higher rate than the network or broker can
> handle
> How could I manager this?
>
>
> > On 26 Oct 2015, at 17:45, Magnus Edenhill  wrote:
> >
> > c) producing messages at a higher rate than the network or broker can
> > handle
>
>
> --
> Informativa sulla Privacy: http://www.unibs.it/node/8155
>



-- 
-
"There are only 10 types of people in the world: Those who understand
binary, and those who don't"


Re: Queue Full

2015-10-26 Thread Magnus Edenhill
That's librdkafka's producer queue that has run full because:
 a) client is not connected to broker
 b) no leader broker for relevant partitions
 c) producing messages at a higher rate than the network or broker can
handle


/Magnus




2015-10-26 16:49 GMT+01:00 Eduardo Costa Alfaia :

> Hi Guys,
>
> How could I solving this problem?
>
> % Failed to produce message: Local: Queue full
> % Failed to produce message: Local: Queue full
>
> Thanks
>
> --
> Informativa sulla Privacy: http://www.unibs.it/node/8155
>


Re: Kafka Best Practices in AWS

2015-10-26 Thread Steve Brandon
Depending on the consumer, I tend to work with c4 instance types and have found
better success. That said, we run about 200+ consumers per server on 16-core
servers and see moderate load.  YMMV
On Oct 26, 2015 8:32 AM, "Jennifer Fountain"  wrote:

> We installed four nodes (r3.xlarge) with approx 100 topics.  Some topics
> have a replication factor of 4, some have 3 and some have 2. Each node is
> running about 90% cpu usage and we are seeing performance issues with
> messages consumption.
>
> Would anyone have any words of wisdom or recommendations for AWS?  Should
> we add more nodes? What should be the replication factor for x amount of
> nodes?
>
> Thank you!
>
> --
>
>
> Jennifer Fountain
> DevOPS
>