Another question: what does BinStorage do compared to InterStorage? It looks like it might just be a legacy storage format.
I'm assuming that you write R_1/R_2/R_3 so that a reader can find the next Tuple in the stream, but once you've found it, can't you just read a tuple, skip 12 bytes (3 ints), and keep reading?

2012/5/23 Jonathan Coveney <jcove...@gmail.com>

> I'm trying to understand how intermediate serialization in Pig works at a
> deeper level (understanding the whole code path, not just BinInterSedes in
> its own vacuum). Right now I am looking at
> InterRecordReader/InterRecordWriter/InterStorage. Is that the right place
> to look for understanding how BinInterSedes is actually called?
>
> Further, I'm trying to better understand the RECORD_1/RECORD_2/RECORD_3
> thing. My guess is that it's to make the file splittable? But I'm not
> really sure. I'd love any pointers about where to look for how
> BinInterSedes is used, and how intermediate storage happens.
>
> Thanks!
> Jon
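
To make the "scan once, then skip" idea concrete, here is a minimal Java sketch of the read path as I picture it. It is not Pig's code: the class MarkerScanSketch, the marker byte values, the 3-byte marker width, and the length-prefixed string payload are all assumptions made up for illustration, and the real InterRecordReader/BinInterSedes layout may differ. It also ignores the case where the marker bytes happen to occur inside a payload.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class MarkerScanSketch {
    // Hypothetical marker bytes, standing in for RECORD_1/RECORD_2/RECORD_3.
    static final byte[] MARKER = {0x01, 0x02, 0x03};

    // Writes each record as: marker, 4-byte big-endian length, UTF-8 payload.
    static byte[] writeRecords(String... records) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (String record : records) {
            byte[] payload = record.getBytes(StandardCharsets.UTF_8);
            byte[] length = ByteBuffer.allocate(4).putInt(payload.length).array();
            buf.write(MARKER, 0, MARKER.length);
            buf.write(length, 0, length.length);
            buf.write(payload, 0, payload.length);
        }
        return buf.toByteArray();
    }

    // Scans forward from 'from' and returns the offset just past the next
    // marker, or -1 if no marker is found.
    static int findNextRecord(byte[] data, int from) {
        for (int i = Math.max(from, 0); i + MARKER.length <= data.length; i++) {
            if (data[i] == MARKER[0] && data[i + 1] == MARKER[1] && data[i + 2] == MARKER[2]) {
                return i + MARKER.length;
            }
        }
        return -1;
    }

    // Reads every record that starts at or after 'splitStart': one scan to
    // synchronize, then sequential reads that skip the fixed-size marker
    // without checking its contents.
    static List<String> readFrom(byte[] data, int splitStart) {
        List<String> records = new ArrayList<>();
        int pos = findNextRecord(data, splitStart);
        while (pos != -1 && pos + 4 <= data.length) {
            int length = ByteBuffer.wrap(data, pos, 4).getInt();
            pos += 4;
            records.add(new String(data, pos, length, StandardCharsets.UTF_8));
            pos += length;
            if (pos + MARKER.length > data.length) {
                break; // no further record follows
            }
            pos += MARKER.length; // skip the marker instead of scanning for it
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = writeRecords("alpha", "beta", "gamma");
        System.out.println(readFrom(data, 0)); // start of file
        System.out.println(readFrom(data, 5)); // offset inside the first record
    }
}
```

In this sketch, a reader dropped at an arbitrary offset only pays for the scan once; after that the marker is just a fixed-width gap between records. Whether Pig's reader can take the same shortcut depends on details of the real format that I haven't confirmed.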