My bad, I didn't explain the problem well. The value displayed in the log is the amount currently allocated by the ProtobufLengthDecoder.allocator, not the size we are trying to allocate. I will add the size we are trying to allocate to the log message and report back here.
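For reference, a minimal, self-contained sketch of what the enhanced log message could look like (this is not the actual Drill code; the class and method names here are made up for illustration):

```java
// Sketch only: stands in for the real logging site in ProtobufLengthDecoder.
// currentAllocation would come from the allocator's accounting, and
// requestedBytes from the length field decoded off the incoming frame.
public class AllocationLogSketch {
    static String failureMessage(long currentAllocation, long requestedBytes) {
        return String.format(
            "Failure allocating buffer on incoming stream due to memory limits. "
            + "Current Allocation: %d. Requested: %d.",
            currentAllocation, requestedBytes);
    }

    public static void main(String[] args) {
        // Values taken from the warning in the original log.
        System.out.println(failureMessage(1372678764L, 262144L));
    }
}
```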
I was assuming the RPC layer uses its own child allocator, and it didn't make sense to me that this allocator reached > 1GB, because the batches should be transferred to their corresponding fragment contexts (we are on the data server side). But while investigating further, I think the ProtobufLengthDecoder is actually using the drillbit's top-level allocator. Am I right? This would explain why the allocator reached its limit. Is there a reason the RPC layer isn't using its own child allocator?

Thanks!

On Tue, Jul 7, 2015 at 10:02 AM, Jacques Nadeau <[email protected]> wrote:

> There is a time where data is read off the socket before we know what type
> of message it is. This socket read buffer is outside the normal flow and
> could grow (although it shouldn't get this big). However, the memory
> you're talking about here is memory allocated due to the size of the
> incoming message. My guess would be either you have unusually large
> records or the length of the message being sent was corrupted. (Assuming
> you are talking about the allocation at [1]).
>
> I would start logging unusually large record batches and see if something
> weird is going on. A record batch shouldn't be larger than 65k records so
> for the batch to be 1gb in size would require each record to be 16k in size
> and for the batch to be the maximum number of records. More realistically,
> we generally target 4k records in a batch which would suggest records that
> are 256k.
>
> [1]
> https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/ProtobufLengthDecoder.java#L87
>
> On Tue, Jul 7, 2015 at 9:13 AM, Abdel Hakim Deneche <[email protected]> wrote:
>
> > Trying to investigate DRILL-3241
> > <https://issues.apache.org/jira/browse/DRILL-3241> (query hangs if out of
> > memory in RPC layer), I see the following warning in the logs:
> >
> > WARN: Failure allocating buffer on incoming stream due to
> > memory limits. Current Allocation: 1372678764.
> >
> > This is happening in ProtobufLengthDecoder.decode() on the receiver side
> > (data server).
> >
> > Is it expected for the connection allocator to allocate > 1GB of memory?
> > Shouldn't the allocated batches be transferred to the receiving
> > fragment's allocator?
> >
> > Thanks!

-- 
Abdelhakim Deneche
Software Engineer
<http://www.mapr.com/>
