Actually, I think I isolated where the error may be.  We have a library that 
was recently updated to fix an issue.  Other code using the same part of the 
library is working properly, but for some reason in this case it isn't.  
Apologies for wasting people's time, but I just never even thought to look 
there since it is working in other places.

Casey
________________________________________
From: Guozhang Wang [wangg...@gmail.com]
Sent: Wednesday, December 11, 2013 12:09 PM
To: users@kafka.apache.org
Subject: Re: Partial Message Read by Consumer

Do you have compression turned on in the broker?

Guozhang


On Wed, Dec 11, 2013 at 8:43 AM, Sybrandy, Casey <
casey.sybra...@six3systems.com> wrote:

> First, I saw the partial message looking at raw network traffic via
> Wireshark, not the output of the iterator as the iterator never seems to
> provide me any data.  That's where the code is hanging.
>
> Second, here's the output from the ConsumerOffsetChecker:
>
> grp1,tdf_topic,0-0 (Group,Topic,BrokerId-PartitionId)
>             Owner = null
>   Consumer offset = 47947
>                   = 47,947 (0.00G)
>          Log size = 1743252
>                   = 1,743,252 (0.00G)
>      Consumer lag = 1695305
>                   = 1,695,305 (0.00G)
>
> BROKER INFO
> 0 -> 127.0.1.1:9092
>
> To answer the questions related to this in the FAQ:
>
> * Yes, there are more messages.
> * No, the messages are all smaller than my configured fetch size.
> * As far as I know, the consumer thread did not stop.  There are no errors
> or exceptions to indicate anything of the sort.
>
> One thing I did notice is that it looks like it's reading from the topic
> before the consumer thread actually starts.  I'm using the pattern where I
> start a new thread per stream and submit them to an ExecutorService.  Not
> sure if this makes a difference, but this is our standard consumer pattern
> and has worked well until I started seeing this issue.  For this consumer,
> I'm only working with one stream.  I tried 2, but no change.
>
> Casey
> ________________________________________
> From: Guozhang Wang [wangg...@gmail.com]
> Sent: Wednesday, December 11, 2013 11:31 AM
> To: users@kafka.apache.org
> Subject: Re: Partial Message Read by Consumer
>
> Casey,
>
> Just to confirm, you saw a partial message output from the iterator.next()
> call, not from the consumer's fetch response, correct?
>
> Guozhang
>
>
> On Wed, Dec 11, 2013 at 8:14 AM, Jun Rao <jun...@gmail.com> wrote:
>
> > Have you looked at
> > https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Myconsumerseemstohavestopped%2Cwhy%3F
> > ?
> > If that doesn't help, could you file a JIRA and attach your log? The Apache
> > mailing list doesn't support attachments.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Wed, Dec 11, 2013 at 6:15 AM, Sybrandy, Casey <
> > casey.sybra...@six3systems.com> wrote:
> >
> > > Hello,
> > >
> > > No, the entire log file isn't bigger than that buffer size and this is
> > > occurring while trying to retrieve the first message on the topic, not the
> > > last.
> > >
> > > I attached a log.  Line 408 (******** Iterating.) is where we get an
> > > iterator and start iterating over the data.  There should be subsequent log
> > > entries displaying a filename, but they never appear after that point.
> > >
> > > Some other thoughts:
> > >
> > > * Network latency is a non-issue as everything is installed on a local VM.
> > > * I tried with both 10 and 100 messages in case I didn't have enough to
> > > make it start producing.  No change.  Yes, I do realize this is silly, but
> > > when nothing else is working, why not give it a try.  It's like adding
> > > magical print statements.
> > >
> > > Hope this helps.  I need it.
> > >
> > > Casey
> > >
> > > ________________________________________
> > > From: Tom Brown [tombrow...@gmail.com]
> > > Sent: Tuesday, December 10, 2013 7:10 PM
> > > To: users@kafka.apache.org
> > > Subject: Re: Partial Message Read by Consumer
> > >
> > > > Having a partial message transfer over the network is the design of Kafka
> > > > 0.7.x (I can't speak to 0.8.x, though it may still be).
> > >
> > > > When the request is made, you tell the server the partition number, the
> > > > byte offset into that partition, and the size of response that you want.
> > > > The server finds that offset in the partition, and sends N bytes back
> > > > (where N is the maximum response size specified). The server does not
> > > > inspect the contents of the reply to ensure that message boundaries line
> > > > up with the response size. This is by design, and the simplicity allows
> > > > for high throughput, at the cost of higher client complexity. In practice
> > > > this means that the response often includes a partial message at the end,
> > > > which the client drops. It also means that if a single message is larger
> > > > than your maximum response size, you will not be able to process that
> > > > message or continue to the next message. Each time you request it, the
> > > > server will send only the partial message, and the Kafka client will send
> > > > the request again.
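
A minimal sketch of the client-side behavior described above, not Kafka's actual code. It assumes a simplified framing in which each message is a 4-byte big-endian length followed by the payload; the real 0.7 wire format carries additional header fields that are omitted here. The names frame and split_complete_messages are hypothetical helpers made up for this illustration.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its 4-byte big-endian length (simplified framing)."""
    return struct.pack(">I", len(payload)) + payload


def split_complete_messages(response: bytes):
    """Return (complete payloads, bytes consumed).

    A trailing partial message is dropped, so the next fetch should start
    at the old offset plus the number of bytes consumed.
    """
    messages = []
    pos = 0
    while pos + 4 <= len(response):
        (size,) = struct.unpack_from(">I", response, pos)
        end = pos + 4 + size
        if end > len(response):
            break  # only part of this message arrived; drop it
        messages.append(response[pos + 4:end])
        pos = end
    return messages, pos


# Stand-in for a partition log: three framed messages, 36 bytes in total.
log = frame(b"first") + frame(b"second message") + frame(b"third")

# A 15-byte fetch from offset 0 cuts through the second message.
msgs, consumed = split_complete_messages(log[0:15])
print(msgs, consumed)  # [b'first'] 9

# Fetching again from offset 9 with the same 15-byte limit makes no progress,
# because the second message needs 18 bytes including its length prefix.
print(split_complete_messages(log[9:9 + 15]))  # ([], 0)
```

With a larger limit, for example log[9:9 + 32], both remaining messages come back whole. That is the situation the fetch.size setting is meant to address when one message exceeds the maximum response size.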
> > >
> > > > If I understand the high-level consumer configuration, the fetch.size
> > > > parameter should be what you need to adjust. Its default is 300K, but I
> > > > see you have it set to roughly 50MB. Is there any chance your message is
> > > > larger than that?
> > >
> > > --Tom
> > >
> > >
> > > > On Tue, Dec 10, 2013 at 1:52 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > >
> > > > Hello Casey,
> > > >
> > > > > What do you mean by "part of a message is being read"? Could you upload
> > > > > the output and also the log of the consumer here?
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Tue, Dec 10, 2013 at 12:26 PM, Sybrandy, Casey <
> > > > casey.sybra...@six3systems.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > First, I'm using version 0.7.2.
> > > > >
> > > > > > I'm trying to read some messages from a broker, but looking at Wireshark,
> > > > > > it appears that only part of a message is being read by the consumer.
> > > > > > After that, no other data is read and I can verify that there are 10
> > > > > > messages on the broker.  I have the consumer configured as follows:
> > > > >
> > > > > kafka.zk.connectinfo=127.0.0.1
> > > > > kafka.zk.groupid=foo3
> > > > > kafka.topic=...
> > > > > fetch.size=52428800
> > > > > socket.buffersize=524288
> > > > >
> > > > > > I only set socket.buffersize today to see if it helps.  Any help would be
> > > > > > great because this is baffling, especially since this only started
> > > > > > happening yesterday.
> > > > >
> > > > > Casey Sybrandy MSWE
> > > > > Six3Systems
> > > > > Cyber and Enterprise Systems Group
> > > > > www.six3systems.com
> > > > > 301-206-6000 (Office)
> > > > > 301-206-6020 (Fax)
> > > > > 11820 West Market Place
> > > > > Suites N-P
> > > > > Fulton, MD. 20759
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>



--
-- Guozhang
