[GitHub] [flink] zhijiangW commented on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode
zhijiangW commented on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode URL: https://github.com/apache/flink/pull/7368#issuecomment-596387275 Thanks for the updates @gaoyunhaii . I left some other comments related to the tests. After addressing them all, you can also paste the micro-benchmark results to show whether this PR has an impact on network performance. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
zhijiangW commented on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode URL: https://github.com/apache/flink/pull/7368#issuecomment-592420093 I have finished reviewing the test code and left some comments. Besides that, we should re-verify the previous e2e tests that encountered direct OutOfMemory errors caused by Netty memory overhead. If we can reduce the frame overhead memory in those tests and they still run normally with this PR's change, that would confirm the effect and value of this PR.
zhijiangW commented on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode URL: https://github.com/apache/flink/pull/7368#issuecomment-591812975 Thanks for the updates and patience @gaoyunhaii . The core code is almost good now, except for one concern about a buffer leak left as an inline comment. I only roughly reviewed the test part and found two main problems: one is deduplication, and the other is avoiding mocking in the new tests. After you address these issues, I will review the detailed logic of the tests.
zhijiangW commented on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode URL: https://github.com/apache/flink/pull/7368#issuecomment-586823140 Let's unify the new class naming and refactor the structure a bit first.
- `ZeroCopyNettyMessageDecoder` -> `NettyMessageClientDecoder`: `ZeroCopy` is not a term easily understood by others, and semantically we have not actually achieved zero copy yet.
- The previous `NettyMessageDecoder` is refactored into the respective `NettyMessageServerDecoder`.
- `NettyMessageParser` -> `NettyMessageDecoderDelegate`: the motivation is to unify the `Parser` and `Decoder` terms. Further, we can define the abstract `NettyMessageDecoderDelegate` to extend `ChannelInboundHandlerAdapter`, so it can reuse the existing `channelActive` and `channelInactive` methods instead of redefining them.
- `BufferResponseParser` -> `BufferResponseDecoderDelegate`
- `DefaultMessageParser` -> `NonBufferResponseDecoderDelegate`? The `Default` term is misleading and we should give it a more precise semantic.
- `ByteBufCumulator` -> `LengthBasedHeaderDecoder`
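The proposed structure could be sketched roughly as follows. This is an illustration only: the class names follow the renames suggested above but do not exist in Flink under these names, the message id for `BufferResponse` is assumed, and the Netty `ChannelInboundHandlerAdapter` base class is stubbed out so the sketch compiles on its own.

```java
// Stub standing in for Netty's ChannelInboundHandlerAdapter, so the sketch
// is self-contained.
abstract class ChannelInboundHandlerAdapterStub {
    void channelActive() {}
    void channelInactive() {}
}

// "Parser" renamed to "DecoderDelegate": one delegate per message family,
// inheriting channelActive/channelInactive instead of redefining them.
abstract class NettyMessageDecoderDelegate extends ChannelInboundHandlerAdapterStub {
    abstract boolean accepts(int msgId);
}

class BufferResponseDecoderDelegate extends NettyMessageDecoderDelegate {
    static final int BUFFER_RESPONSE_ID = 0; // assumed id, for illustration only

    boolean accepts(int msgId) {
        return msgId == BUFFER_RESPONSE_ID;
    }
}

class NonBufferResponseDecoderDelegate extends NettyMessageDecoderDelegate {
    boolean accepts(int msgId) {
        return msgId != BufferResponseDecoderDelegate.BUFFER_RESPONSE_ID;
    }
}

// The client-side decoder picks a delegate by message id and stays agnostic
// of how each message family is actually parsed.
class NettyMessageClientDecoder {
    private final NettyMessageDecoderDelegate[] delegates = {
            new BufferResponseDecoderDelegate(),
            new NonBufferResponseDecoderDelegate()
    };

    String delegateFor(int msgId) {
        for (NettyMessageDecoderDelegate delegate : delegates) {
            if (delegate.accepts(msgId)) {
                return delegate.getClass().getSimpleName();
            }
        }
        throw new IllegalStateException("no delegate for message id " + msgId);
    }

    public static void main(String[] args) {
        NettyMessageClientDecoder decoder = new NettyMessageClientDecoder();
        System.out.println(decoder.delegateFor(0)); // BufferResponseDecoderDelegate
        System.out.println(decoder.delegateFor(5)); // NonBufferResponseDecoderDelegate
    }
}
```

With this shape, adding a new message family means adding one more delegate, without touching the outer decoder's dispatch logic.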
zhijiangW commented on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode URL: https://github.com/apache/flink/pull/7368#issuecomment-584510940 Thanks for the update @gaoyunhaii ! While reviewing the core code of `ZeroCopyNettyMessageDecoder` and `NettyMessageParser`, I have several concerns:
1. It is hard to extend a new message type in the future, because one has to understand the implementation details of the above decoder and parser to handle the new message properly in the different internal processes. It would be better to make the decoder's implementation details transparent to new message extensions.
2. There are many interactions among the decoder, the parser and `BufferResponse`, which brings many intermediate states to maintain inside the decoder and parser. That makes things more complex to understand and is prone to potential bugs. E.g. the specific message header length is obtained from `NettyMessage` via interaction with the parser, and the intermediate state is maintained inside the decoder.
3. The real decode work is mainly done by the specific `NettyMessage#readFrom`, but now the decoder is also involved in part of that work, which seems fragmented and not unified.
Another possible option is to make `ZeroCopyNettyMessageDecoder` more lightweight and remove `NettyMessageParser`. We can delegate all the decode work to the respective `NettyMessage`, so the current `ZeroCopyNettyMessageDecoder` only handles the unified frame header for all messages, as before. We can refactor the current `NettyMessage#readFrom` method into an abstract `decode` method instead. In order not to affect the existing messages on the sender side, we can define a new `NettyClientMessage` extending `NettyMessage` only for the receiver side.
The newly defined `NettyClientMessage#decode` method plays a similar role to `RecordDeserializer`: it is aware of the implementation details needed to parse the header and body properly, and returns the proposed `DecodeResult` to indicate whether the header is fulfilled, the body is fulfilled, the received `ByteBuf` is fully consumed, etc. This way, `ZeroCopyNettyMessageDecoder` does not need to know the details of the different `NettyMessage`s; it only controls the general logic of when to fire the full message to the next handler. And when extending a new message in the future, one only needs to implement the abstract `decode` method for parsing it, without being aware of the outer `ZeroCopyNettyMessageDecoder`.
zhijiangW commented on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode URL: https://github.com/apache/flink/pull/7368#issuecomment-582745768 I get the general idea of the implementation, but have not thought the details through yet. This improvement only applies to the body of the `BufferResponse` message on the receiver side; we still have memory overhead for the other messages and for the frame and header parts of `BufferResponse`. I am not concerned about the other messages because they are infrequent at runtime, except for `AddCredit`, which might have the same frequency as `BufferResponse`. Although the size of `AddCredit` is very small, I am not sure whether it would still inflate the Netty memory overhead in practice. I noticed that the sender side also reuses the same new `ZeroCopyNettyMessageDecoder` as the receiver side, and I have some concerns here. This new decoder maintains/manages the overhead memory for the frame and header parts, compared with the previous `LengthFieldBasedFrameDecoder` embedded in Netty. If the new decoder is more efficient or memory saving than `LengthFieldBasedFrameDecoder` (e.g. for the `AddCredit` case mentioned above), that is reasonable; otherwise I would prefer not to reuse it on the sender side, for two reasons:
- Doing so does not actually unify the decoder on the sender and receiver sides, because we still define different `NettyMessageParser` implementations inside `ZeroCopyNettyMessageDecoder`, which has the same effect as having different decoders on the sender and receiver sides.
- It brings more changes and seems more complex. We can retain the previous `NettyMessageDecoder` for the sender side and only remove the `BufferResponse` path from it, then define a separate decoder only for the receiver side.
Next I will think further about how to refactor the implementation of `ZeroCopyNettyMessageDecoder`.
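For reference, the frame-header accumulation that `LengthFieldBasedFrameDecoder` (or any hand-rolled replacement of it) must perform can be sketched as below. The 9-byte header layout (4-byte frame length, 4-byte magic number, 1-byte message id) and the magic value are assumptions for illustration, not taken from the Flink sources, and `java.nio.ByteBuffer` stands in for Netty's `ByteBuf`.

```java
import java.nio.ByteBuffer;

// Sketch of the per-message state such a decoder must keep itself: a small
// fixed-size header buffer that may be filled across several network reads.
class FrameHeaderCumulator {
    // Assumed layout: frame length + magic number + message id.
    static final int FRAME_HEADER_LENGTH = 4 + 4 + 1;

    private final ByteBuffer header = ByteBuffer.allocate(FRAME_HEADER_LENGTH);

    /** Returns the frame length once the header is complete, or -1 if more bytes are needed. */
    int offer(ByteBuffer in) {
        while (header.hasRemaining() && in.hasRemaining()) {
            header.put(in.get());
        }
        if (header.hasRemaining()) {
            return -1; // header still incomplete, wait for the next read
        }
        return header.getInt(0); // frame length is the first header field
    }

    public static void main(String[] args) {
        FrameHeaderCumulator cumulator = new FrameHeaderCumulator();
        // The header arrives split across two reads: first only the length field...
        ByteBuffer part1 = (ByteBuffer) ByteBuffer.allocate(4).putInt(128).flip();
        System.out.println(cumulator.offer(part1)); // -1: header incomplete
        // ...then the (assumed) magic number and message id.
        ByteBuffer part2 = (ByteBuffer) ByteBuffer.allocate(5).putInt(0xBADC0FFE).put((byte) 0).flip();
        System.out.println(cumulator.offer(part2)); // 128: header complete, frame length known
    }
}
```

Whether copying these few header bytes per message ends up cheaper than `LengthFieldBasedFrameDecoder`'s cumulation, especially for a high-frequency small message like `AddCredit`, is exactly the question raised above.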
zhijiangW commented on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode URL: https://github.com/apache/flink/pull/7368#issuecomment-576116409 @gaoyunhaii , after you rebase onto the latest code to resolve the conflicts, please ping me and I will review it.