[ https://issues.apache.org/jira/browse/PIG-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779054#action_12779054 ]
Thejas M Nair commented on PIG-1062:
------------------------------------

{quote}
In SampleLoader.java
====================
Isn't the idea of SampleLoader only to carry common code for RandomSampleLoader and PoissonLoader and add a computeSamples() method?
- Looks like now it has the getNext() implementation needed by RandomSampleLoader in it now. Should we move that to RandomSampleLoader instead?
{quote}
RandomSampleLoader.getNext() is fairly generic: it can be used by any new sample loader class where the number of samples to be taken in each map is known in advance. So having this getNext() implementation in SampleLoader can be useful in the future.

{quote}
Why is skipNext() needed? Can't loader.getNext() == null be used instead? If so, is recordReader needed?
{quote}
skipNext() calls recordReader.getNext(), which does not parse the record into a tuple, unlike loader.getNext(). This way records can be skipped more efficiently.

I will create a new patch addressing the other comments.

> load-store-redesign branch: change SampleLoader and subclasses to work with
> new LoadFunc interface
> ---------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1062
>                 URL: https://issues.apache.org/jira/browse/PIG-1062
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>         Attachments: PIG-1062.patch, PIG-1062.patch.3
>
>
> This is part of the effort to implement new load store interfaces as laid out
> in http://wiki.apache.org/pig/LoadStoreRedesignProposal .
> PigStorage and BinStorage are now working.
> SampleLoader and its subclasses (RandomSampleLoader, PoissonSampleLoader) need to
> be changed to work with the new LoadFunc interface.
> Fixing SampleLoader and RandomSampleLoader will get order-by queries working.
> PoissonSampleLoader is used by skew join.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
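
To illustrate the skipNext() point in the comment above, here is a minimal, self-contained sketch of why skipping at the reader level is cheaper than calling the loader's getNext() and discarding the result. All class and method names below (SkipSketch, RawReader, SketchLoader, advance(), sample()) are hypothetical stand-ins and not Pig's or Hadoop's actual API; the fixed-stride sampling is only a placeholder, not the algorithm RandomSampleLoader uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch; none of these names are Pig's or Hadoop's real API. */
public class SkipSketch {

    /** Stand-in for a record reader: advances over raw lines without parsing. */
    static class RawReader {
        private final Iterator<String> lines;
        private String current;

        RawReader(List<String> input) { this.lines = input.iterator(); }

        /** Moves to the next raw record; returns false at end of input. */
        boolean advance() {
            if (!lines.hasNext()) { current = null; return false; }
            current = lines.next();
            return true;
        }

        String current() { return current; }
    }

    /** Stand-in for a loader that parses each raw record into fields. */
    static class SketchLoader {
        private final RawReader reader;
        int parsedCount = 0; // how many records paid the parsing cost

        SketchLoader(RawReader reader) { this.reader = reader; }

        /** Reads and parses the next record, or returns null at end of input. */
        List<String> getNext() {
            if (!reader.advance()) return null;
            parsedCount++;
            return Arrays.asList(reader.current().split("\t"));
        }

        /** Skips one record without parsing it; returns false at end of input. */
        boolean skipNext() { return reader.advance(); }
    }

    /** Keeps one record, then skips `skip` records unparsed, and repeats. */
    static List<List<String>> sample(SketchLoader loader, int skip) {
        List<List<String>> out = new ArrayList<>();
        while (true) {
            List<String> tuple = loader.getNext();
            if (tuple == null) break;
            out.add(tuple);
            for (int i = 0; i < skip; i++) {
                if (!loader.skipNext()) return out;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a\t1", "b\t2", "c\t3", "d\t4", "e\t5");
        SketchLoader loader = new SketchLoader(new RawReader(input));
        List<List<String>> samples = sample(loader, 1);
        System.out.println(samples + " parsed=" + loader.parsedCount);
    }
}
```

In this sketch only the records that are actually kept go through the parsing step; the skipped ones advance the reader and nothing more, which is the saving the comment describes.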