[jira] [Updated] (HADOOP-13223) winutils.exe is a bug nexus and should be killed with an axe.

2018-06-05 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13223:

Issue Type: Sub-task  (was: Improvement)
Parent: HADOOP-15461

> winutils.exe is a bug nexus and should be killed with an axe.
> -------------------------------------------------------------
>
>                 Key: HADOOP-13223
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13223
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: bin
>    Affects Versions: 2.6.0
>         Environment: Microsoft Windows, all versions
>            Reporter: john lilley
>            Priority: Major
>
> winutils.exe was apparently created as a stopgap measure to allow Hadoop to 
> "work" on Windows platforms, because the NativeIO libraries aren't 
> implemented there (edit: even NativeIO probably doesn't cover the operations 
> that winutils.exe is used for).  Rather than building a DLL that makes native 
> OS calls, the creators of winutils.exe must have decided that it would be 
> more expedient to create an EXE to carry out file system operations in a 
> linux-like fashion.  Unfortunately, like many stopgap measures in software, 
> this one has persisted well beyond its expected lifetime and usefulness.  My 
> team creates software that runs on Windows and Linux, and winutils.exe is 
> probably responsible for 20% of all issues we encounter, both during 
> development and in the field.
> Problem #1 with winutils.exe is that it is simply missing from many popular 
> distros and/or the client-side software installation for said distros, when 
> supplied, fails to install winutils.exe.  Thus, as software developers, we 
> are forced to pick one version and distribute and install it with our 
> software.
> Which leads to problem #2: versions of winutils.exe are not always
> compatible with each other.  In particular, MapR MUST have its winutils.exe
> in the system path, but doing so breaks the Hadoop distro for every other
> Hadoop vendor.  This makes creating and maintaining test environments that
> work with all of the Hadoop distros we want to test unnecessarily tedious
> and error-prone.
> Problem #3 is that the mechanism by which you inform the Hadoop client 
> software where to find winutils.exe is poorly documented and fragile.  First, 
> it can be in the PATH.  If it is in the PATH, that is where it is found.  
> However, the documentation, such as it is, makes no mention of this, and 
> instead says that you should set the HADOOP_HOME environment variable, which 
> does NOT override the winutils.exe found in your system PATH.
> Which leads to problem #4: There is no logging that says where winutils.exe
> was actually found and loaded.  Because of this, diagnosing cases where the
> wrong winutils.exe was found is extremely difficult.
> Problem #5 is that most of the time, such as when accessing straight up HDFS 
> and YARN, one does not *need* winutils.exe.  But if it is missing, the log 
> messages complain about its absence.  When we are trying to diagnose an 
> obscure issue in Hadoop (of which there are many), the presence of this red 
> herring leads to all sorts of time wasted until someone on the team points 
> out that winutils.exe is not the problem, at least not this time.
> Problem #6 is that errors and stack traces from issues involving winutils.exe 
> are not helpful.  The Java stack trace ends at the ProcessBuilder call.  Only 
> through bitter experience is one able to connect the dots from 
> "ProcessBuilder is the last thing on the stack" to "something is wrong with 
> winutils.exe".
> Note that none of these involve running Hadoop on Windows.  They are only 
> encountered when using Hadoop client libraries to access a cluster from 
> Windows.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13223) winutils.exe is a bug nexus and should be killed with an axe.

2016-06-01 Thread john lilley (JIRA)

john lilley updated HADOOP-13223:
---------------------------------
Description: 
winutils.exe was apparently created as a stopgap measure to allow Hadoop to 
"work" on Windows platforms, because the NativeIO libraries aren't implemented 
there (edit: even NativeIO probably doesn't cover the operations that 
winutils.exe is used for).  Rather than building a DLL that makes native OS 
calls, the creators of winutils.exe must have decided that it would be more 
expedient to create an EXE to carry out file system operations in a linux-like 
fashion.  Unfortunately, like many stopgap measures in software, this one has 
persisted well beyond its expected lifetime and usefulness.  My team creates 
software that runs on Windows and Linux, and winutils.exe is probably 
responsible for 20% of all issues we encounter, both during development and in 
the field.

Problem #1 with winutils.exe is that it is simply missing from many popular 
distros and/or the client-side software installation for said distros, when 
supplied, fails to install winutils.exe.  Thus, as software developers, we are 
forced to pick one version and distribute and install it with our software.

Which leads to problem #2: versions of winutils.exe are not always compatible
with each other.  In particular, MapR MUST have its winutils.exe in the system
path, but doing so breaks the Hadoop distro for every other Hadoop vendor.
This makes creating and maintaining test environments that work with all of
the Hadoop distros we want to test unnecessarily tedious and error-prone.

Problem #3 is that the mechanism by which you inform the Hadoop client software 
where to find winutils.exe is poorly documented and fragile.  First, it can be 
in the PATH.  If it is in the PATH, that is where it is found.  However, the 
documentation, such as it is, makes no mention of this, and instead says that 
you should set the HADOOP_HOME environment variable, which does NOT override 
the winutils.exe found in your system PATH.

Which leads to problem #4: There is no logging that says where winutils.exe was
actually found and loaded.  Because of this, diagnosing cases where the wrong
winutils.exe was found is extremely difficult.
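
One way to see which copies are in play is to list every winutils.exe visible
from the environment.  The following is a minimal diagnostic sketch, not
Hadoop's own lookup code: the class WinutilsLocator and its method
findCandidates are hypothetical names, and the assumption that only PATH and
HADOOP_HOME\bin matter is taken from the description above.  It reports which
copies exist, not which one a given Hadoop client will actually load.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical diagnostic helper; not part of Hadoop. */
final class WinutilsLocator {
    static final String EXE = "winutils.exe";

    /**
     * Returns every existing winutils.exe found in the directories of
     * pathValue (in PATH order), followed by hadoopHome/bin if it holds one.
     */
    static List<Path> findCandidates(String pathValue, String hadoopHome) {
        List<Path> found = new ArrayList<>();
        if (pathValue != null) {
            for (String entry : pathValue.split(Pattern.quote(File.pathSeparator))) {
                addIfPresent(found, entry, null);
            }
        }
        if (hadoopHome != null) {
            addIfPresent(found, hadoopHome, "bin");
        }
        return found;
    }

    private static void addIfPresent(List<Path> found, String dir, String subDir) {
        String trimmed = dir.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        try {
            Path base = Paths.get(trimmed);
            if (subDir != null) {
                base = base.resolve(subDir);
            }
            Path candidate = base.resolve(EXE).toAbsolutePath().normalize();
            // Record each distinct file once, preserving discovery order.
            if (Files.isRegularFile(candidate) && !found.contains(candidate)) {
                found.add(candidate);
            }
        } catch (InvalidPathException e) {
            // Malformed PATH entries (for example, stray quotes) are skipped.
        }
    }

    public static void main(String[] args) {
        List<Path> found =
            findCandidates(System.getenv("PATH"), System.getenv("HADOOP_HOME"));
        if (found.isEmpty()) {
            System.out.println("No winutils.exe on PATH or under HADOOP_HOME\\bin");
        }
        for (Path p : found) {
            System.out.println("candidate: " + p);
        }
        if (found.size() > 1) {
            System.out.println("WARNING: more than one copy is visible");
        }
    }
}
```

Running this on a test machine before switching distros prints each candidate
in PATH order, followed by the HADOOP_HOME copy if there is one.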

Problem #5 is that most of the time, such as when accessing straight up HDFS 
and YARN, one does not *need* winutils.exe.  But if it is missing, the log 
messages complain about its absence.  When we are trying to diagnose an obscure 
issue in Hadoop (of which there are many), the presence of this red herring 
leads to all sorts of time wasted until someone on the team points out that 
winutils.exe is not the problem, at least not this time.
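
One possible way to avoid the red herring is to defer the lookup until an
operation actually needs the helper, so that nothing is reported when it is
never used.  The sketch below illustrates that idea only; it is not how the
Hadoop client is implemented.  The class LazyTool, its require method, and the
operation names passed to it are all hypothetical.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical sketch of a lazily resolved helper; not Hadoop's code. */
final class LazyTool {
    private final Supplier<Optional<String>> locator;
    private Optional<String> cached; // stays null until the first lookup

    LazyTool(Supplier<Optional<String>> locator) {
        // Constructing the object performs no lookup and logs nothing.
        this.locator = locator;
    }

    /**
     * Resolves the helper on first use and caches the result.  If it is
     * absent, fails with a message naming the operation that needed it.
     */
    synchronized String require(String operation) {
        if (cached == null) {
            cached = locator.get();
        }
        return cached.orElseThrow(() -> new IllegalStateException(
            "winutils.exe is required for '" + operation
                + "' but was not found"));
    }
}
```

With this shape, a client that only talks to HDFS and YARN never triggers the
lookup, and a client that does need the helper gets one error that names the
operation involved.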

Problem #6 is that errors and stack traces from issues involving winutils.exe 
are not helpful.  The Java stack trace ends at the ProcessBuilder call.  Only 
through bitter experience is one able to connect the dots from "ProcessBuilder 
is the last thing on the stack" to "something is wrong with winutils.exe".
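
A caller can make such failures easier to read by wrapping the launch and
naming the executable in the exception.  This is a minimal sketch under that
assumption, using only the standard ProcessBuilder API; the class ExternalTool
and its run method are hypothetical and do not exist in Hadoop.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical wrapper that adds context to launch failures. */
final class ExternalTool {
    /**
     * Starts the executable with the given arguments, waits for it to
     * finish, and returns its exit code.  A failure to start is rethrown
     * with the executable name and full command in the message.
     */
    static int run(String executable, String... args)
            throws IOException, InterruptedException {
        List<String> command = new ArrayList<>();
        command.add(executable);
        command.addAll(Arrays.asList(args));
        ProcessBuilder builder = new ProcessBuilder(command).inheritIO();
        final Process process;
        try {
            process = builder.start();
        } catch (IOException e) {
            // Keep the original exception as the cause so nothing is lost.
            throw new IOException("Could not launch external helper '"
                + executable + "' (command: " + command
                + "); check that the file exists and is the intended copy", e);
        }
        return process.waitFor();
    }
}
```

The stack trace still ends at ProcessBuilder, but the message now says which
executable was being started and with which arguments.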

Note that none of these involve running Hadoop on Windows.  They are only 
encountered when using Hadoop client libraries to access a cluster from Windows.

[jira] [Updated] (HADOOP-13223) winutils.exe is a bug nexus and should be killed with an axe.

2016-06-01 Thread john lilley (JIRA)

john lilley updated HADOOP-13223:
---------------------------------
Summary: winutils.exe is a bug nexus and should be killed with an axe.  
(was: winutils.exe is an abomination and should be killed with an axe.)
