Peter Slawski created PIG-4950:
----------------------------------
Summary: Fix minor issues with running scripts in non-local
FileSystems
Key: PIG-4950
URL: https://issues.apache.org/jira/browse/PIG-4950
Project: Pig
Issue Type: Bug
Affects Versions: 0.15.0, 0.16.0
Reporter: Peter Slawski
Priority: Minor
There are two similar minor issues with running Pig scripts located on
non-local FileSystems such as HDFS and S3.
# The first occurs when the script path is passed using the ‘-f’ option. In
this case the ‘remote’ path is treated as a local one, so reading it throws an
IOException, a WARN message is logged, and the script contents are not set in
ScriptState. The path of the downloaded script should be passed to
ScriptState#setScript instead. As a result of this bug, an empty string is set
for the “pig.script” property when the Pig job runs on a Hadoop cluster. Also,
if Tez is being used, the DAG info does not include the script contents as it
normally does when a local script is passed.
# The second issue is more minor: #validateLogFile in the Main class is called
with the path given by the user rather than the downloaded local file path.
Again, #validateLogFile treats the given path as a local one, which is wrong
when the user specifies a remote path, i.e. one with a scheme such as hdfs or
s3. This occurs in both cases: when the script is specified using the ‘-f’
option and when the script is passed as the last/remaining argument.
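To illustrate why treating a remote path as a local one fails, here is a minimal standalone sketch. It does not use Pig code; the class and method names (RemotePathAsLocalSketch, schemeOf, existsAsLocalFile) and the example paths are made up for illustration. It only shows that java.io.File interprets a URI-style string as a plain local path, while java.net.URI exposes the scheme.

```java
import java.io.File;
import java.net.URI;

// Illustrative only: these names are hypothetical and are not part of Pig.
final class RemotePathAsLocalSketch {

    // Returns the URI scheme ("hdfs", "s3", ...) or null for a plain path.
    // URI.create throws IllegalArgumentException for strings that are not valid URIs.
    static String schemeOf(String path) {
        return URI.create(path).getScheme();
    }

    // java.io.File knows nothing about schemes, so a remote URI is just
    // looked up as a (nonexistent) path on the local disk.
    static boolean existsAsLocalFile(String path) {
        return new File(path).isFile();
    }

    public static void main(String[] args) {
        String remote = "hdfs://namenode/user/pig/script.pig"; // example path, assumed
        System.out.println("scheme: " + schemeOf(remote));
        System.out.println("found as local file: " + existsAsLocalFile(remote));
        System.out.println("scheme of /tmp/script.pig: " + schemeOf("/tmp/script.pig"));
    }
}
```

Any code that opens such a path through local file APIs would get an IOException of the kind described above.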
The fix for both issues is to pass in the local downloaded path instead. If the
script path specified is already a local one, the local downloaded path is
simply that same path.
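The idea can be sketched roughly as follows. This is not the actual patch and does not call Pig APIs: resolveLocalPath, hasRemoteScheme, and the 'download' function are hypothetical stand-ins, and the paths are invented. The point is only that the local path is resolved once and the same value is then handed to every consumer that expects a local file.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.function.UnaryOperator;

// Illustrative only: these names are hypothetical and are not part of Pig.
final class LocalScriptPathSketch {

    // Returns the path that local-only consumers should use. 'download' stands in
    // for whatever copies a remote script to local disk and returns the copy's path.
    static String resolveLocalPath(String userPath, UnaryOperator<String> download) {
        return hasRemoteScheme(userPath) ? download.apply(userPath) : userPath;
    }

    // True when the path carries a scheme other than "file", e.g. hdfs or s3.
    static boolean hasRemoteScheme(String path) {
        try {
            String scheme = new URI(path).getScheme();
            return scheme != null && !scheme.equalsIgnoreCase("file");
        } catch (URISyntaxException e) {
            return false; // not parseable as a URI: treat it as a plain local path
        }
    }

    public static void main(String[] args) {
        // Fake downloader that pretends the script was fetched to a temp location.
        UnaryOperator<String> fakeDownload = remote -> "/tmp/pig-fetched/script.pig";

        String local = resolveLocalPath("s3://bucket/script.pig", fakeDownload);
        // Both consumers named in this issue would receive the same local path.
        System.out.println("ScriptState#setScript <- " + local);
        System.out.println("#validateLogFile      <- " + local);

        // A path that is already local is returned unchanged.
        System.out.println(resolveLocalPath("/home/me/script.pig", fakeDownload));
    }
}
```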
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)