[ https://issues.apache.org/jira/browse/PIG-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906709#comment-15906709 ]
liyunzhang_intel commented on PIG-5177:
---------------------------------------

[~szita]: Thanks for the detailed explanation. In MR mode, Pig creates a jar and puts the script files into it. MR uploads the jar to the HDFS distributed cache and saves the jar path in ${{mapred.job.classpath.files}} of the configuration (for details, see JobControlCompiler#putJarOnClassPathThroughDistributedCache). So in ScriptEngine#getScriptAsStream, MR can later load the script file from the class loader in the YARN container. Is my understanding right? If so, my question is: can a Spark executor also load the script file from the class loader if we wrap the script files into a jar, upload the jar to the HDFS distributed cache, and save the jar path in ${{mapred.job.classpath.files}}?

> Scripting and StreamingPythonUDFs fail with Spark exec type
> -----------------------------------------------------------
>
>                 Key: PIG-5177
>                 URL: https://issues.apache.org/jira/browse/PIG-5177
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>             Fix For: spark-branch
>
>         Attachments: PIG-5177.0.patch, PIG-5177.1.patch, PIG-5177.2.patch
>
>
> We are thrown an exception because the Python script file is not found on the
> backend side (on spark executors).

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)