[ 
https://issues.apache.org/jira/browse/PIG-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964023#comment-15964023
 ] 

liyunzhang_intel commented on PIG-5176:
---------------------------------------

[~nkollar]: with this patch, users cannot ship a file with the same name twice, 
even when the Netty file server is not used in Spark 1.6. Is that really the 
behavior we want?
If yes, we should document that users cannot ship a file with the same name 
twice in Pig on Spark; otherwise, we should still allow users to ship a file 
with the same name twice when they don't use the Netty file server. Another 
option is to resolve this JIRA after we upgrade to Spark 2.1. Can you give me 
any suggestions?
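
One possible direction (a sketch only, not necessarily the patch's approach): deduplicate file names on the Pig side before calling SparkContext.addFile, so the same name is never registered with Netty twice. The ShipDeduper class below is hypothetical; a plain Consumer<String> stands in for the real sc.addFile(path) call so the sketch stays self-contained.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical sketch: track file names that have already been shipped and
// skip duplicates, so addFile is never invoked twice with the same name
// (which triggers "File ... already registered" in NettyStreamManager).
public class ShipDeduper {
    private final Set<String> shippedNames = new HashSet<>();
    private final Consumer<String> addFile; // stand-in for sc.addFile(path)

    public ShipDeduper(Consumer<String> addFile) {
        this.addFile = addFile;
    }

    /** Ships the file unless its base name was already shipped.
     *  Returns true if shipped, false if skipped as a duplicate. */
    public boolean ship(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        if (!shippedNames.add(name)) {
            return false; // same base name already registered; skip it
        }
        addFile.accept(path);
        return true;
    }
}
```

The trade-off is that two genuinely different files with the same base name would be silently collapsed into one, so a warning log on the skipped path would probably be wanted in a real implementation.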

> Several ComputeSpec test cases fail
> -----------------------------------
>
>                 Key: PIG-5176
>                 URL: https://issues.apache.org/jira/browse/PIG-5176
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Nandor Kollar
>            Assignee: Nandor Kollar
>             Fix For: spark-branch
>
>         Attachments: PIG-5176.patch
>
>
> Several ComputeSpec test cases failed on my cluster:
> ComputeSpec_5 - ComputeSpec_13
> These scripts have a ship() clause in the DEFINE statement, and the shipped 
> files include the script file itself, so we add the same file to the Spark 
> context twice. This is not a problem with Hadoop, but it looks like Spark 
> doesn't allow registering the same file name twice:
> {code}
> Caused by: java.lang.IllegalArgumentException: requirement failed: File PigStreamingDepend.pl already registered.
>         at scala.Predef$.require(Predef.scala:233)
>         at org.apache.spark.rpc.netty.NettyStreamManager.addFile(NettyStreamManager.scala:69)
>         at org.apache.spark.SparkContext.addFile(SparkContext.scala:1386)
>         at org.apache.spark.SparkContext.addFile(SparkContext.scala:1348)
>         at org.apache.spark.api.java.JavaSparkContext.addFile(JavaSparkContext.scala:662)
>         at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.addResourceToSparkJobWorkingDirectory(SparkLauncher.java:462)
>         at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.shipFiles(SparkLauncher.java:371)
>         at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.addFilesToSparkJob(SparkLauncher.java:357)
>         at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.uploadResources(SparkLauncher.java:235)
>         at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:222)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
