[ https://issues.apache.org/jira/browse/AVRO-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831563#action_12831563 ]
Doug Cutting commented on AVRO-406:
-----------------------------------

> if you say only the first enclosing array is 'streaming' that means the
> sub-array is NOT streamed, right?

The current Avro binary data format supports streaming of nested arrays. For example, BlockingBinaryEncoder implements this so that, if an inner array has millions of elements, it is blocked. Each block is prefixed not just with the number of elements, but also with the number of bytes in the block. So arbitrarily large, nested data structures may already be efficiently streamed through Avro.

As I think about it more, I believe the Iterator<Iterator<Foo>> approach has merit. Avro's runtime supports the notion of efficient skipping. It doesn't seem overly complex for the outer iterator to know whether the inner iterator has completed, and, if it has not, to skip accordingly. I believe this can be implemented with a call to ParsingDecoder#skipTo(level), where level is the parser's stack level of the outer iterator. Note that, since AVRO-388, all Java binary decoders are parser-based, and hence support this. I find the consistency and generality of this model attractive.

Thiru, does this sound plausible?

> Support streaming RPC calls
> ---------------------------
>
>                 Key: AVRO-406
>                 URL: https://issues.apache.org/jira/browse/AVRO-406
>             Project: Avro
>          Issue Type: New Feature
>          Components: java, spec
>            Reporter: Todd Lipcon
>
> Avro nicely supports chunking of container types into multiple frames. We
> need to expose this to the RPC layer to facilitate use cases like the Hadoop
> Datanode, where a single "RPC" can yield far more data than should be
> buffered in memory.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
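The block format discussed in the comment above comes from Avro's binary encoding: an array is written as a series of blocks, each holding a count of items; when the count is written as a negative number, its absolute value is the item count and it is followed by the block's size in bytes, which is what lets a reader skip a block without decoding it. Below is a minimal Python sketch of that wire format (not the Java BlockingBinaryEncoder itself); the helper names `write_array_block` and `skip_array` are illustrative, not Avro APIs:

```python
import io

def write_long(out, n):
    """Write a long using Avro's zigzag + variable-length encoding."""
    n = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes stay small
    while (n & ~0x7F) != 0:
        out.write(bytes([(n & 0x7F) | 0x80]))  # continuation bit set
        n >>= 7
    out.write(bytes([n]))

def read_long(inp):
    """Read a zigzag varint long."""
    shift, acc = 0, 0
    while True:
        b = inp.read(1)[0]
        acc |= (b & 0x7F) << shift
        if not (b & 0x80):
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)  # undo zigzag

def write_array_block(out, items, write_item):
    """One array block, size-prefixed so readers can skip it cheaply."""
    body = io.BytesIO()
    for item in items:
        write_item(body, item)
    data = body.getvalue()
    write_long(out, -len(items))  # negative count => byte size follows
    write_long(out, len(data))    # byte size of the block's data
    out.write(data)

def skip_array(inp):
    """Skip a whole array by jumping over size-prefixed blocks."""
    while True:
        count = read_long(inp)
        if count == 0:
            return  # zero count terminates the array
        if count >= 0:
            raise NotImplementedError("plain block: items must be decoded")
        size = read_long(inp)
        inp.seek(size, io.SEEK_CUR)  # jump past the block without decoding
```

A writer would emit one or more such blocks followed by a zero count; a reader that wants the whole inner array out of the way calls `skip_array`, touching only the block headers. This is the property that makes the Iterator<Iterator<Foo>> idea cheap: the outer iterator can discard an unfinished inner array at block granularity.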