[ https://issues.apache.org/jira/browse/SPARK-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285366#comment-14285366 ]
Hong Shen commented on SPARK-5347:
----------------------------------

This happens because, in HadoopRDD, inputMetrics is only set when the split is an instance of FileSplit, but CombineFileInputFormat uses an InputSplit. The check does not need to be isInstanceOf[FileSplit]; isInstanceOf[InputSplit] is sufficient.

{code}
if (bytesReadCallback.isDefined) {
  val bytesReadFn = bytesReadCallback.get
  inputMetrics.bytesRead = bytesReadFn()
} else if (split.inputSplit.value.isInstanceOf[FileSplit]) {
  // If we can't get the bytes read from the FS stats, fall back to the split size,
  // which may be inaccurate.
  try {
    inputMetrics.bytesRead = split.inputSplit.value.getLength
    context.taskMetrics.inputMetrics = Some(inputMetrics)
  } catch {
    case e: java.io.IOException =>
      logWarning("Unable to get input size to set InputMetrics for task", e)
  }
}
{code}

> InputMetrics bug when inputSplit is not instanceOf FileSplit
> ------------------------------------------------------------
>
>                 Key: SPARK-5347
>                 URL: https://issues.apache.org/jira/browse/SPARK-5347
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.0
>            Reporter: Hong Shen
>
> When inputFormatClass is set to CombineFileInputFormat, input metrics show
> that input is empty. It doesn't appear in Spark 1.1.0.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
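The type-hierarchy point behind the comment can be sketched with stand-in classes. This is a hypothetical minimal model, not the real Hadoop API (the actual types live in org.apache.hadoop.mapred, and Spark's own check above is Scala): both FileSplit and a combine-style split satisfy the InputSplit interface, so narrowing the fallback check to FileSplit silently skips combine splits and leaves the metric unset.

```java
// Hypothetical stand-ins for the Hadoop split types, to show the
// instanceof behaviour only; names mirror the real classes but the
// implementations are invented for this sketch.
interface InputSplit { long getLength(); }

class FileSplit implements InputSplit {
    private final long length;
    FileSplit(long length) { this.length = length; }
    public long getLength() { return length; }
}

class CombineFileSplit implements InputSplit {
    private final long[] lengths;
    CombineFileSplit(long... lengths) { this.lengths = lengths; }
    public long getLength() {
        long total = 0;
        for (long l : lengths) total += l;
        return total;
    }
}

public class SplitCheck {
    // Old check: only FileSplit reports a size; combine splits fall through
    // and the metric stays unset (modelled here as -1).
    static long bytesReadOld(InputSplit s) {
        return (s instanceof FileSplit) ? s.getLength() : -1;
    }

    // Proposed check: every InputSplit exposes getLength(), so testing the
    // broader interface is enough for the split-size fallback.
    static long bytesReadNew(InputSplit s) {
        return (s instanceof InputSplit) ? s.getLength() : -1;
    }

    public static void main(String[] args) {
        InputSplit combine = new CombineFileSplit(100, 200);
        System.out.println(bytesReadOld(combine)); // -1: input metrics appear empty
        System.out.println(bytesReadNew(combine)); // 300: split size is reported
    }
}
```

This matches the reported symptom: with CombineFileInputFormat the old branch never runs, so the fallback split size is never recorded and the input metrics show empty.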