[ 
https://issues.apache.org/jira/browse/PIG-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781238#comment-13781238
 ] 

Aniket Mokashi commented on PIG-3480:
-------------------------------------

bq. evaluate effect on size of compressed data for TFile vs SeqFile when TFile does work
https://issues.apache.org/jira/browse/HADOOP-3315?focusedCommentId=12631905&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12631905
 has some benchmark details for SequenceFile vs TFile.
bq. add tests, make TFile tests pass (in this file they fail, because of course TFile is not being used)
I will submit a patch for this.
bq. make SeqFile the default method, since it doesn't break
+1 for this as the effect is not substantially worse.
bq. allow TFile use by a switch, since current users may want to keep it. I would prefer to not do that, but might if the first step shows significant differences.
[~rohini], what are your thoughts on this?

> TFile-based tmpfile compression crashes in some cases
> -----------------------------------------------------
>
>                 Key: PIG-3480
>                 URL: https://issues.apache.org/jira/browse/PIG-3480
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Dmitriy V. Ryaboy
>             Fix For: 0.12.0
>
>         Attachments: PIG-3480.patch
>
>
> When pig tmpfile compression is on, some jobs fail inside core hadoop 
> internals.
> Suspect TFile is the problem, because an experiment in replacing TFile with 
> SequenceFile succeeded.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
