[
https://issues.apache.org/jira/browse/PIG-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363875#comment-17363875
]
Koji Noguchi commented on PIG-5319:
-----------------------------------
I do see the OutputFormat being created twice (marked *** below).
Using Spark 2.4:
{code:scala|title=SparkHadoopWriter.scala}
117 committer.setupTask(taskContext) ***
118
119 // Initiate the writer.
120 config.initWriter(taskContext, sparkPartitionId) ***
{code}
Within setupTask and initWriter, each path creates its own OutputFormat instance.
Trace for each:
{noformat}
SparkHadoopWriter.scala:117 committer.setupTask(taskContext)
--> HadoopMapReduceCommitProtocol.scala:217 setupCommitter(taskContext)
--> --> HadoopMapReduceCommitProtocol.scala:94 val format = context.getOutputFormatClass.newInstance()
{noformat}
and
{noformat}
SparkHadoopWriter.scala:120 config.initWriter(taskContext, sparkPartitionId)
--> SparkHadoopWriter.scala:343 val taskFormat = getOutputFormat()
--> --> SparkHadoopWriter.scala:384 outputFormat.newInstance()
{noformat}
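Condensed, the two paths amount to something like this (a simplified sketch based on the traces above, not Spark's verbatim source; "context" stands for the Hadoop TaskAttemptContext):
{code:scala}
import org.apache.hadoop.mapreduce.TaskAttemptContext

def sketch(context: TaskAttemptContext): Unit = {
  // Path 1: HadoopMapReduceCommitProtocol.setupCommitter creates a first
  // OutputFormat instance, solely to obtain its OutputCommitter.
  val format = context.getOutputFormatClass.newInstance()
  val committer = format.getOutputCommitter(context)
  committer.setupTask(context)

  // Path 2: initWriter/getOutputFormat creates a second, independent
  // OutputFormat instance, which produces the actual RecordWriter.
  val taskFormat = context.getOutputFormatClass.newInstance()
  val writer = taskFormat.getRecordWriter(context)

  // "committer" and "writer" now come from two different OutputFormat
  // instances, so any per-instance state is not shared between them.
}
{code}
So the OutputCommitter and the RecordWriter are produced by two different PigOutputFormat instances, and any per-instance state is lost between them.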
> Investigate why TestStoreInstances fails with Spark 2.2
> -------------------------------------------------------
>
> Key: PIG-5319
> URL: https://issues.apache.org/jira/browse/PIG-5319
> Project: Pig
> Issue Type: Bug
> Components: spark
> Reporter: Nándor Kollár
> Priority: Major
>
> TestStoreInstances unit test fails with Spark 2.2.x. It seems the job and task
> commit logic changed a lot since Spark 2.1.x: now it looks like Spark uses one
> PigOutputFormat instance when writing to files, and a different one when
> getting the OutputCommitter.
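For illustration, here is a minimal, hypothetical OutputFormat (not Pig's actual PigOutputFormat; class and field names are made up) that shares per-instance state between its writer and its committer, which is the pattern that breaks once the two come from different instances:
{code:scala}
import org.apache.hadoop.mapreduce.{OutputCommitter, RecordWriter, TaskAttemptContext}
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat

// Hypothetical example of an OutputFormat whose committer depends on
// state accumulated by its writer on the same instance.
class StatefulOutputFormat extends TextOutputFormat[String, String] {
  private var recordsWritten = 0L // per-instance state

  override def getRecordWriter(ctx: TaskAttemptContext): RecordWriter[String, String] = {
    val inner = super.getRecordWriter(ctx)
    new RecordWriter[String, String] {
      override def write(k: String, v: String): Unit = {
        recordsWritten += 1 // updates *this* instance only
        inner.write(k, v)
      }
      override def close(c: TaskAttemptContext): Unit = inner.close(c)
    }
  }

  override def getOutputCommitter(ctx: TaskAttemptContext): OutputCommitter = {
    // A committer that expects to observe recordsWritten at commit time.
    // Under Spark 2.2+ it is obtained from a *different* instance than the
    // one doing the writing, so it always sees recordsWritten == 0.
    super.getOutputCommitter(ctx)
  }
}
{code}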