[ https://issues.apache.org/jira/browse/PIG-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897455#action_12897455 ]
Thejas M Nair commented on PIG-1501: ------------------------------------ Why was TFile chosen over SequenceFile ? I am wondering if the additional unused features of TFile (index, metadata) result in any overhead compared to SequenceFile. > need to investigate the impact of compression on pig performance > ---------------------------------------------------------------- > > Key: PIG-1501 > URL: https://issues.apache.org/jira/browse/PIG-1501 > Project: Pig > Issue Type: Test > Reporter: Olga Natkovich > Assignee: Yan Zhou > Fix For: 0.8.0 > > Attachments: compress_perf_data.txt, compress_perf_data_2.txt, > PIG-1501.patch > > > We would like to understand how compressing map results as well as well as > reducer output in a chain of MR jobs impacts performance. We can use PigMix > queries for this investigation. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.