Okay, I checked on this. The producer computes the checksum twice, which is
once too many. The server computes the checksum an astounding 39 times in
0.8 branch! I think this answers the earlier questions: The right thing to
do is clearly to remove the check from the iterator as people are iterating
many times.

-Jay

On Fri, Sep 14, 2012 at 9:12 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> I actually think the validBytes() call is looping and hence computing CRCs
> too. So actually at least three times.
>
> -Jay
>
>
> On Fri, Sep 14, 2012 at 9:10 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
>
>> Hey guys,
>>
>> I noticed that our ByteBufferMessageSet iterator computes the message CRC
>> on iteration. Log.append also loops through all messages and checks the
>> same checksum. This means we are effectively computing the CRC32 twice
>> inside Log.append (we may actually be doing it more than that we would have
>> to look for other times we loop through the message set). This is actually
>> quite expensive. Does anyone remember why the second check was added? I am
>> not completely sure of the right fix--intuitively I think we should remove
>> the automatic check in the iterator and instead add a check in the
>> high-level consumer. I think it is okay for the simple consumer to be left
>> to manually call isValid or not since it is a lower-level interface.
>>
>> Thoughts?
>>
>> -Jay
>>
>
>

Reply via email to