[ 
https://issues.apache.org/jira/browse/PIG-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2793:
----------------------------

    Component/s:     (was: parser)
                     (was: grunt)
                     (was: tools)
    
> Make Pig Work on Windows without Cygwin
> ---------------------------------------
>
>                 Key: PIG-2793
>                 URL: https://issues.apache.org/jira/browse/PIG-2793
>             Project: Pig
>          Issue Type: Bug
>         Environment: Windows without Cygwin as a whole, but with some key 
> binaries such as perl, diff, gawk, gzip, sed.
>            Reporter: John Gordon
>
> For Pig to work well on Windows, it needs Hadoop core changes, which are 
> currently in progress in branch-1-win.  For this work, I am running Pig on 
> Windows against branch-1-win and removing Cygwin dependencies as 
> capabilities open up.  Branch-1-win is fairly stable now, and exposes enough 
> functionality to show the few things Pig needs in order to run end to end on 
> a cross-platform Hadoop core without Cygwin.  This umbrella JIRA should 
> track all of the work to get Pig running well on Windows without Cygwin.
> There are a few types of work that I think are needed right now (I will 
> break out sub-JIRAs to track them):
> TEST:
> --------
> 1.) Tests that generate Pig script strings with paths in them (e.g. 
> dynamically built load/store commands) need to escape the backslash ("\") 
> characters in those paths, since backslashes can now occur in both Hadoop 
> and local paths.
> 2.) Tests that generate local temporary files with createTempFile, and then 
> try to use those as HDFS paths need to remove ":" from the generated file 
> name to create valid HDFS paths.
> 3.) Tests that hand-generate URIs via string concatenation (e.g. "file:" + 
> strFileName) need to use Util.generateURI instead to get a valid URI for the 
> target platform.
> 4.) Tests that assume the first line of a script (e.g. #!/bin/sh) selects 
> the interpreter automatically need to invoke the interpreter explicitly 
> (e.g. instead of calling "perlscript.pl" they should call "perl 
> perlscript.pl").
> 5.) Tests need small adjustments for differences in quoting and command 
> syntax between shells (e.g. " vs. ', dir vs. ls).
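
A minimal Java sketch of TEST items 1-3 above, under stated assumptions. The class and method names (PathTestHelpers, escapeForPigScript, stripColons, toFileUri) are hypothetical and are not existing Pig APIs. The signature of Util.generateURI is not given in this issue, so the sketch uses java.io.File.toURI() as a stand-in for producing a platform-valid URI.

```java
import java.io.File;

// Hypothetical test helpers; none of these names exist in Pig.
final class PathTestHelpers {
    private PathTestHelpers() {}

    // Item 1: double each backslash so a Windows path survives being
    // embedded in a generated Pig script string.
    static String escapeForPigScript(String path) {
        return path.replace("\\", "\\\\");
    }

    // Item 2: drop ":" (e.g. from a drive letter) so a name derived from a
    // local temp file can be reused as an HDFS path.
    static String stripColons(String name) {
        return name.replace(":", "");
    }

    // Item 3: let the JDK build the file URI instead of concatenating
    // "file:" + name by hand. Stand-in for Util.generateURI (assumption).
    static String toFileUri(File file) {
        return file.toURI().toString();
    }
}
```

A test that builds a load statement could then write "load '" + PathTestHelpers.escapeForPigScript(path) + "'" rather than concatenating the raw path.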
> PROD:
> --------
> 1.) The streaming interface needs to be fixed to run without a Cygwin 
> dependency.
> 2.) The pig.additional.jars separator is currently hardcoded to ":", and 
> should be File.pathSeparator instead (":" on Linux, ";" on Windows) so that 
> it can accept Windows paths (C:\file.jar for instance).
> 3.) The Grunt "sh" command directly exposes the behavior of the exec API.  
> If you use a shell built-in, it fails with file not found.  This exposes 
> many differences between shell implementations (e.g. ls is an exe, but dir 
> is a built-in), and many of the cases in TestGrunt end up running (sh bash 
> -c "command").  For portability and ease of use, sh should instead exec 
> "sh -c <command>" on Linux and "cmd /C <command>" on Windows.  That would 
> make it possible to use aliases and bat files on either platform, and make 
> the interface more platform-independent for end users.
> 4.) (eventual) Update Pig's dependencies to pick up a stable Hadoop core that 
> runs on Windows from a release branch.
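
PROD items 2 and 3 above could look roughly like the following Java sketch. This is an illustration under assumptions, not the actual Pig change: PlatformHelpers and its methods are hypothetical names, and the os.name check is one common way to detect Windows rather than whatever Pig or Hadoop uses.

```java
import java.io.File;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helpers; none of these names exist in Pig.
final class PlatformHelpers {
    private PlatformHelpers() {}

    // Assumption: detect Windows from the os.name system property.
    static boolean isWindows() {
        return System.getProperty("os.name", "")
                .toLowerCase(Locale.ROOT)
                .startsWith("windows");
    }

    // Item 2: split a jar list on an explicit separator. Pattern.quote keeps
    // the separator from being interpreted as a regular expression.
    static String[] splitJars(String value, String separator) {
        return value.split(Pattern.quote(separator));
    }

    // Item 2: default to File.pathSeparator (":" on Linux, ";" on Windows)
    // so that "C:\file.jar" is not cut at the drive-letter colon on Windows.
    static String[] splitJars(String value) {
        return splitJars(value, File.pathSeparator);
    }

    // Item 3: wrap the user's command in the platform shell so built-ins,
    // aliases and bat files resolve the way they do at a prompt.
    static String[] shellCommand(String command, boolean windows) {
        return windows
                ? new String[] {"cmd", "/C", command}
                : new String[] {"sh", "-c", command};
    }
}
```

The resulting array can be handed to new ProcessBuilder(PlatformHelpers.shellCommand(cmd, PlatformHelpers.isWindows())). Taking the platform flag as a parameter keeps the command construction testable on either operating system.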

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
