Great!

Sorry I missed the earlier discussion thread. Is there a target version for
this support? I assume the milestone is still in a dev branch?

On Thu, Apr 28, 2022 at 8:26 AM Gautham Banasandra <gaur...@apache.org>
wrote:

> Hi Hadoop devs,
>
> I would like to announce that we recently reached a new milestone: we
> finished all the tasks in item 3 under Phase 1. This implies that
> all the HDFS native client tools[1] have become cross platform now. We're
> inching closer towards making Hadoop cross platform. Watch this space for
> more updates.
>
> [1]
> https://github.com/apache/hadoop/tree/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/tools
>
> Thanks,
> --Gautham
>
> On Mon, 21 Feb 2022 at 00:12, Gautham Banasandra <gaur...@apache.org>
> wrote:
>
> > Hi all,
> >
> > I've been working on getting Hadoop to build on Windows for quite some
> > time now. We're now at a stage where we can parallelize the effort and
> > complete this sooner. I've outlined the parts that are remaining. Please
> > get in touch with me if anyone wishes to join hands in realizing this
> > goal.
> >
> > *Why do we need Hadoop to run on Windows?*
> > Windows has a very large user base. Modern alternatives to Hadoop (like
> > Kubernetes) are cross platform by design. We have to acknowledge the
> > fact that it isn't easy to get Hadoop running on Windows. The reason why
> > we haven't seen much adoption of Hadoop on Windows is probably issues
> > like compilation failures and the need for workarounds at every step of
> > the way. If we were to nail these issues, I believe it would
> > tremendously expand the usage of Hadoop.
> >
> > I plan to complete this in 4 phases.
> >
> > *Phase 1 : Building Hadoop on Windows*
> > 1. [HADOOP-17193] Compile Hadoop on Windows natively - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HADOOP-17193>
> > The Hadoop build on Windows is currently broken because of the POSIX API
> > calls made in the HDFS native client (libhdfspp). MinGW and Cygwin
> > provide POSIX implementations on Windows. While it's possible to use
> > their C++ compilers, it won't be the same as compiling Hadoop with Visual
> > C++. The Visual C++ runtime is the native C++ runtime on Windows and
> > provides many more capabilities (like core dumps) than its alternatives.
> > Thus, it's essential to get Hadoop to compile with Visual Studio on
> > Windows. We'll be using Visual Studio 2019.
> >
> > 2. [HDFS-15843] [libhdfs++] Make write cross platform - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HDFS-15843>
> > Until recently, Hadoop was being built with C++11. I upgraded the
> > compiler version to a level where it supports C++17 so that we have
> > access to std::filesystem and a few other modern C++ APIs. However,
> > there are some cases where the C++17 APIs don't suffice. Thus, I wrote
> > the XPlatform library
> > <https://github.com/apache/hadoop/tree/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/x-platform>,
> > which is a collection of system call APIs implemented in a cross-platform
> > friendly manner. The CMake build system will choose the appropriate
> > platform implementation while building so that we can do away with all
> > the #ifdefs based on platform in the code. In summary, if you ever come
> > across a need to use system calls, please put them into the XPlatform
> > library and use its APIs.
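
To make the std::filesystem point concrete, here is a minimal sketch of the kind of portable helper C++17 allows. It assumes a C++17 compiler, and the `FileName` function is a hypothetical name chosen for illustration, not an API taken from the XPlatform library:

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper (not part of XPlatform): returns the last component
// of a path using only the C++17 standard library, so no POSIX basename()
// call and no platform-specific #ifdef is needed.
std::string FileName(const std::string& file_path) {
  return std::filesystem::path(file_path).filename().string();
}
```

With this, `FileName("/tmp/data/file.txt")` yields `file.txt` on any platform whose standard library implements std::filesystem. Calls that the standard library does not cover are the ones that would belong in the XPlatform library instead.
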
> >
> > 3. [HDFS-16474] Make HDFS tail tool cross platform - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HDFS-16474>
> >     [HDFS-16473] Make HDFS stat tool cross platform - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HDFS-16473>
> >     [HDFS-16472] Make HDFS setrep tool cross platform - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HDFS-16472>
> >     [HDFS-16471] Make HDFS ls tool cross platform - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HDFS-16471>
> >     [HDFS-16470] Make HDFS find tool cross platform - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HDFS-16470>
> > The HDFS native client tools use the getopt API to parse the command
> > line arguments. getopt isn't available on Windows. One can follow this PR
> > to make the above tools cross platform compatible - HDFS-16285. Make HDFS
> > ownership tools cross platform by GauthamBanasandra · Pull Request #3588 ·
> > apache/hadoop (github.com) <https://github.com/apache/hadoop/pull/3588>.
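
To illustrate the kind of change involved, here is a minimal sketch of parsing short flags with only the C++ standard library instead of getopt. The `ParsedArgs` struct, the `ParseArgs` function and the `-R`/`-h` flags are all hypothetical, and this is not necessarily the approach taken in the PR above:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for a tool that accepts -R and -h.
struct ParsedArgs {
  bool recursive = false;               // set by -R
  bool help = false;                    // set by -h
  bool ok = true;                       // false if an unknown flag was seen
  std::vector<std::string> positional;  // everything that is not a flag
};

// Parses arguments (excluding the program name) without getopt.
// Supports bundled short flags such as "-Rh" and "--" to end option parsing.
ParsedArgs ParseArgs(const std::vector<std::string>& args) {
  ParsedArgs parsed;
  bool options_done = false;
  for (const std::string& arg : args) {
    if (!options_done && arg == "--") {
      options_done = true;
      continue;
    }
    if (!options_done && arg.size() > 1 && arg[0] == '-') {
      for (std::size_t i = 1; i < arg.size(); ++i) {
        switch (arg[i]) {
          case 'R': parsed.recursive = true; break;
          case 'h': parsed.help = true; break;
          default: parsed.ok = false; break;
        }
      }
    } else {
      parsed.positional.push_back(arg);
    }
  }
  return parsed;
}
```

A tool's main function would build the vector from `argv + 1` up to `argv + argc`, call `ParseArgs`, and print its usage text when `ok` is false or `help` is set.
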
> >
> > 4. [HDFS-16463] Make dirent.h cross platform compatible - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HDFS-16463>
> >     [HDFS-16465] Make usage of strings.h cross platform compatible - ASF
> > JIRA (apache.org) <https://issues.apache.org/jira/browse/HDFS-16465>
> > For these JIRAs, the header files aren't available on Windows. Thus, we
> > need to inspect the APIs that have been used from these headers and
> > implement them.
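
As an example of implementing one such API, here is a minimal sketch of a portable case-insensitive comparison, assuming that something like strcasecmp from strings.h is among the functions in use. The name `CaseInsensitiveCompare` is hypothetical, and the comparison is ASCII-oriented:

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for strcasecmp: returns a negative value, zero, or
// a positive value if lhs sorts before, equal to, or after rhs, ignoring case.
int CaseInsensitiveCompare(std::string_view lhs, std::string_view rhs) {
  const std::size_t common = lhs.size() < rhs.size() ? lhs.size() : rhs.size();
  for (std::size_t i = 0; i < common; ++i) {
    // Cast to unsigned char first: passing a negative char to tolower is
    // undefined behavior.
    const int l = std::tolower(static_cast<unsigned char>(lhs[i]));
    const int r = std::tolower(static_cast<unsigned char>(rhs[i]));
    if (l != r) return l - r;
  }
  if (lhs.size() == rhs.size()) return 0;
  return lhs.size() < rhs.size() ? -1 : 1;
}
```

Because it uses only standard headers, the same source compiles with Visual C++ and with GCC or Clang, so call sites would not need strings.h at all.
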
> >
> > 5. [HDFS-16464] Create only libhdfspp static libraries for Windows - ASF
> > JIRA (apache.org) <https://issues.apache.org/jira/browse/HDFS-16464>
> > There are some issues with producing Hadoop DLLs on Windows. So, let's
> > plan to deliver only static libraries in this phase.
> >
> > 6. [HDFS-16466] Implement Linux permission flags on Windows - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HDFS-16466>
> > 7. [HDFS-16467] Ensure Protobuf generated headers are included first -
> > ASF JIRA (apache.org) <https://issues.apache.org/jira/browse/HDFS-16467>
> > 8. [HDFS-16468] Define ssize_t for Windows - ASF JIRA (apache.org)
> > <https://issues.apache.org/jira/browse/HDFS-16468>
> > 9. [HDFS-16469] Locate protoc-gen-hrpc.exe on Windows - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HDFS-16469>
> > 10. [YARN-11078] Set env vars in a cross platform compatible way - ASF
> > JIRA (apache.org) <https://issues.apache.org/jira/browse/YARN-11078>
> >
> > *Phase 2 : Set up CI for Hadoop on Windows*
> > 1. [HADOOP-18133] Add Dockerfile for Windows 10 - ASF JIRA (apache.org)
> > <https://issues.apache.org/jira/browse/HADOOP-18133>
> > 2. [HADOOP-18134] Run CI for Windows 10 - ASF JIRA (apache.org)
> > <https://issues.apache.org/jira/browse/HADOOP-18134>
> > We really must set up the CI for Hadoop on Windows to ensure that this
> > never breaks again.
> >
> > *Phase 3 : Resolving systemic issues*
> > 1. [HADOOP-13223] winutils.exe is a bug nexus and should be killed with
> > an axe. - ASF JIRA (apache.org)
> > <https://issues.apache.org/jira/browse/HADOOP-13223>
> > The Hadoop environment is modeled closer to that of Linux than Windows.
> > Thus, we see a lot of functional gaps between running Hadoop on Linux vs.
> > Windows, which have become the source of bugs when it comes to running
> > Hadoop on Windows. One such issue is that of winutils.exe. We can aim to
> > address issues like these in this phase. I plan to provide a JNI
> > implementation for each platform and unify these under a common file
> > system interface, so that we get stack traces for exceptions thrown in
> > these layers and, above all, so that we don't have any disparity between
> > the platforms.
> >
> > *Phase 4 : Produce Windows distribution of Hadoop*
> > 1. [HADOOP-18135] Produce Windows binaries of Hadoop - ASF JIRA
> > (apache.org) <https://issues.apache.org/jira/browse/HADOOP-18135>
> > The public should be able to download and install Hadoop on their Windows
> > computers.
> >
> > Thanks,
> > --Gautham
> >
>
