Another question: what does BinStorage do compared to InterStorage?
It looks like it might just be a legacy storage format.

I'm assuming that you write the R_1/R_2/R_3 markers so that a reader can
find the next Tuple in the stream, but once you've done that, can't you just
read a tuple, skip 12 bytes (3 ints), and keep reading?
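
To make the question concrete, here is a toy sketch of the pattern I have in
mind. It is not Pig's actual on-disk format: the marker values, the
length-prefixed payload, and the helper names (write_records,
seek_to_next_record, read_records) are all made up for illustration. The only
thing taken from the above is the assumption that each record is preceded by
a 12-byte marker of three ints.

```python
import io
import struct

# Hypothetical 12-byte sync marker: three big-endian ints (values are made up).
MARKER = struct.pack(">iii", 1, 2, 3)


def write_records(payloads):
    """Write each payload as: marker, 4-byte length, payload bytes."""
    out = io.BytesIO()
    for payload in payloads:
        out.write(MARKER)
        out.write(struct.pack(">i", len(payload)))
        out.write(payload)
    return out.getvalue()


def seek_to_next_record(data, offset):
    """Scan forward from an arbitrary offset to the next marker (-1 if none)."""
    return data.find(MARKER, offset)


def read_records(data, start):
    """Starting at a marker: skip its 12 bytes, read one record, repeat."""
    records = []
    pos = start
    while pos != -1 and pos + len(MARKER) + 4 <= len(data):
        pos += len(MARKER)  # skip the 3 ints
        (length,) = struct.unpack_from(">i", data, pos)
        pos += 4
        records.append(data[pos:pos + length])
        pos += length
    return records


data = write_records([b"alpha", b"beta", b"gamma"])
# Pretend a split starts in the middle of the first record.
start = seek_to_next_record(data, 5)
print(read_records(data, start))
```

In this toy version the scan is only needed once, when a reader lands at an
arbitrary offset; after that, sequential reading just skips a fixed number of
bytes per record. A real format would also need to handle the marker bytes
appearing inside a payload, which this sketch ignores.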

2012/5/23 Jonathan Coveney <jcove...@gmail.com>

> I'm trying to understand how intermediate serialization in Pig works at a
> deeper level (understanding the whole code path, not just BinInterSedes in
> its own vacuum). Right now I am looking at
> InterRecordReader/InterRecordWriter/InterStorage. Is that the right place
> to look for understanding how BinInterSedes is actually called?
>
> Further, I'm trying to better understand the RECORD_1/RECORD_2/RECORD_3
> thing. My guess is that it's to make the file splittable? But I'm not
> really sure. I'd love any pointers about where to look for how
> BinInterSedes is used, and how intermediate storage happens.
>
> Thanks!
> Jon
>
