Peter Slawski created PIG-4950: ---------------------------------- Summary: Fix minor issues with running scripts in non-local FileSystems Key: PIG-4950 URL: https://issues.apache.org/jira/browse/PIG-4950 Project: Pig Issue Type: Bug Affects Versions: 0.15.0, 0.16.0 Reporter: Peter Slawski Priority: Minor
There are two similar minor issues regarding running Pig scripts located in non-local FileSystems such as hdfs and s3. # The first occurs when the script path is passed using the ‘-f’ option. In this case, the script contents are not set in ScriptState. Instead a WARN message is logged due to an IOException being thrown. This is because the ‘remote’ path is treated as a local one. Instead, the path of the downloaded script should be passed over to ScriptState#setScript. As a result of this bug, an empty string is set for the “pig.script” property when the Pig job runs on a Hadoop cluster. Also, if Tez is being used, then the Dag info does not include the script contents as it normally does when a local script is passed. # The second issue is more minor, but #validateLogFile in the Main class is set to use the path given by the user rather than using the downloaded local file path. Again, #validateLogFile method treats the given path as a local one, but this would not be the case if the user specifies a remote path. i.e. one with a scheme such as hdfs or s3. This occurs in both cases: when the script is specified using the ‘-f’ option or when the script is passed as the last/remaining argument. Both fixes to these issues are to just pass in the local downloaded path instead. If the script path specified is a local one, then the local downloaded path would just be that path specified. -- This message was sent by Atlassian JIRA (v6.3.4#6332)