[ 
https://issues.apache.org/jira/browse/PIG-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168572#comment-15168572
 ] 

Pallavi Rao commented on PIG-4788:
----------------------------------

[~kellyzly], as you mentioned, since the wrappedSplits are used rather than 
the PigSplit directly, this change may be OK. However, I think we should still 
check with pig-dev@ to make sure it has no other impact before making this 
change.

The other option I can think of, which avoids changing the main codebase, is 
to add a PigSplitSpark class that is basically a copy of PigSplit but extends 
FileSplit. This class can then be used in PigInputFormatSpark in place of 
PigSplit. Yes, there will be some code redundancy, but it is a safer change.
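
A minimal sketch of this second option, for illustration only. InputSplit and 
FileSplit below are simplified stand-ins for the Hadoop classes so that the 
example compiles on its own. The PigSplitSpark constructor and the 
BytesReadCheck helper are hypothetical; the real class would copy PigSplit's 
fields and serialization logic. The sketch assumes that the engine only 
collects the bytes-read metric when the split is a FileSplit, which is the 
reason for the subclass:

```java
import java.util.Arrays;
import java.util.List;

// Stand-in for org.apache.hadoop.mapreduce.InputSplit (simplified, assumption).
abstract class InputSplit {
    abstract long getLength();
}

// Stand-in for org.apache.hadoop.mapreduce.lib.input.FileSplit (simplified, assumption).
class FileSplit extends InputSplit {
    private final String path;
    private final long start;
    private final long length;

    FileSplit(String path, long start, long length) {
        this.path = path;
        this.start = start;
        this.length = length;
    }

    String getPath() { return path; }

    long getStart() { return start; }

    @Override
    long getLength() { return length; }
}

// Hypothetical PigSplitSpark: wraps the underlying splits the way PigSplit
// does, but is itself a FileSplit so file-based metric checks recognise it.
class PigSplitSpark extends FileSplit {
    private final List<InputSplit> wrappedSplits;

    PigSplitSpark(String path, List<InputSplit> wrappedSplits) {
        // Report the combined length of all wrapped splits.
        super(path, 0L, totalLength(wrappedSplits));
        this.wrappedSplits = wrappedSplits;
    }

    private static long totalLength(List<InputSplit> splits) {
        long total = 0L;
        for (InputSplit split : splits) {
            total += split.getLength();
        }
        return total;
    }

    List<InputSplit> getWrappedSplits() { return wrappedSplits; }
}

// Simplified model (an assumption) of an engine that tracks bytes read
// only for file-based splits.
class BytesReadCheck {
    static boolean tracksBytesRead(InputSplit split) {
        return split instanceof FileSplit;
    }
}
```

With this shape, PigInputFormatSpark could hand out PigSplitSpark instances 
while PigSplit itself stays untouched, at the cost of the duplicated code 
mentioned above.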

> the value BytesRead metric info always returns 0 even the length of input 
> file is not 0 in spark engine
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-4788
>                 URL: https://issues.apache.org/jira/browse/PIG-4788
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-4788.patch
>
>
> In 
> [JobMetricsListener#onTaskEnd|https://github.com/apache/pig/blob/spark/src/org/apache/pig/tools/pigstats/spark/SparkJobStats.java#L140],
>  taskMetrics.inputMetrics().get().bytesRead() always returns 0 even when the 
> length of the input file is not zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
