Great! Sorry I missed the earlier discussion thread. Is there a target version for this support? I assume the milestone is still in a dev branch?

On Thu, Apr 28, 2022 at 8:26 AM Gautham Banasandra <gaur...@apache.org> wrote:

> Hi Hadoop devs,
>
> I would like to announce that we recently reached a new milestone: we
> finished all the tasks in item 3 under Phase 1. This means that all the
> HDFS native client tools[1] are now cross platform. We're inching closer
> towards making Hadoop cross platform. Watch this space for more updates.
>
> [1] =
> https://github.com/apache/hadoop/tree/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/tools
>
> Thanks,
> --Gautham
>
> On Mon, 21 Feb 2022 at 00:12, Gautham Banasandra <gaur...@apache.org>
> wrote:
>
> > Hi all,
> >
> > I've been working on getting Hadoop to build on Windows for quite some
> > time now. We're now at a stage where we can parallelize the effort and
> > complete this sooner. I've outlined the parts that are remaining. Please
> > get in touch with me if anyone wishes to join hands in realizing this
> > goal.
> >
> > *Why do we need Hadoop to run on Windows?*
> > Windows has a very large user base. Modern alternatives to Hadoop (like
> > Kubernetes) are cross platform by design. We have to acknowledge that it
> > isn't easy to get Hadoop running on Windows. The reason we haven't seen
> > much adoption of Hadoop on Windows is probably issues like compilation,
> > requiring work-arounds every step of the way, etc. If we were to nail
> > these issues, I believe it would tremendously expand the usage of Hadoop.
> >
> > I plan to complete this in 4 phases.
> >
> > *Phase 1 : Building Hadoop on Windows*
> > 1. [HADOOP-17193] Compile Hadoop on Windows natively
> >    <https://issues.apache.org/jira/browse/HADOOP-17193>
> > The Hadoop build on Windows is currently broken because of the POSIX API
> > calls made in the HDFS native client (libhdfspp). MinGW and Cygwin
> > provide POSIX implementations on Windows. While it's possible to use
> > these C++ compilers, it won't be the same as compiling Hadoop with
> > Visual C++. The Visual C++ runtime is the native C++ runtime on Windows
> > and provides many more capabilities (like core dumps etc.) than its
> > alternatives. Thus, it's essential to get Hadoop to compile with Visual
> > Studio on Windows. We'll be using Visual Studio 2019.
> >
> > 2. [HDFS-15843] [libhdfs++] Make write cross platform
> >    <https://issues.apache.org/jira/browse/HDFS-15843>
> > Until recently, Hadoop was being built with C++11. I upgraded the
> > compiler version to a level where it supports C++17 so that we have
> > access to std::filesystem and a few other modern C++ APIs. However,
> > there are some cases where the C++17 APIs don't suffice. Thus, I wrote
> > the XPlatform library
> > <https://github.com/apache/hadoop/tree/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/x-platform>,
> > which is a collection of system call APIs implemented in a
> > cross-platform friendly manner. The CMake build system will choose the
> > appropriate platform implementation while building so that we can do
> > away with all the platform-based #ifdefs in the code. In summary, if you
> > ever come across a need to use system calls, please put them into the
> > XPlatform library and use its APIs.
> >
> > 3. [HDFS-16474] Make HDFS tail tool cross platform
> >    <https://issues.apache.org/jira/browse/HDFS-16474>
> >    [HDFS-16473] Make HDFS stat tool cross platform
> >    <https://issues.apache.org/jira/browse/HDFS-16473>
> >    [HDFS-16472] Make HDFS setrep tool cross platform
> >    <https://issues.apache.org/jira/browse/HDFS-16472>
> >    [HDFS-16471] Make HDFS ls tool cross platform
> >    <https://issues.apache.org/jira/browse/HDFS-16471>
> >    [HDFS-16470] Make HDFS find tool cross platform
> >    <https://issues.apache.org/jira/browse/HDFS-16470>
> > The HDFS native client tools use the getopt API to parse the command
> > line arguments. getopt isn't available on Windows. One can follow this
> > PR to make the above tools cross platform compatible: HDFS-16285. Make
> > HDFS ownership tools cross platform (Pull Request #3588)
> > <https://github.com/apache/hadoop/pull/3588>.
> >
> > 4. [HDFS-16463] Make dirent.h cross platform compatible
> >    <https://issues.apache.org/jira/browse/HDFS-16463>
> >    [HDFS-16465] Make usage of strings.h cross platform compatible
> >    <https://issues.apache.org/jira/browse/HDFS-16465>
> > For these JIRAs, the header files aren't available on Windows. Thus, we
> > need to inspect the APIs that have been used from these headers and
> > implement them.
> >
> > 5. [HDFS-16464] Create only libhdfspp static libraries for Windows
> >    <https://issues.apache.org/jira/browse/HDFS-16464>
> > There are some issues with producing Hadoop DLLs on Windows. So, let's
> > plan to deliver only static libraries in this phase.
> >
> > 6. [HDFS-16466] Implement Linux permission flags on Windows
> >    <https://issues.apache.org/jira/browse/HDFS-16466>
> > 7. [HDFS-16467] Ensure Protobuf generated headers are included first
> >    <https://issues.apache.org/jira/browse/HDFS-16467>
> > 8. [HDFS-16468] Define ssize_t for Windows
> >    <https://issues.apache.org/jira/browse/HDFS-16468>
> > 9. [HDFS-16469] Locate protoc-gen-hrpc.exe on Windows
> >    <https://issues.apache.org/jira/browse/HDFS-16469>
> > 10. [YARN-11078] Set env vars in a cross platform compatible way
> >    <https://issues.apache.org/jira/browse/YARN-11078>
> >
> > *Phase 2 : Setup CI for Hadoop on Windows*
> > 1. [HADOOP-18133] Add Dockerfile for Windows 10
> >    <https://issues.apache.org/jira/browse/HADOOP-18133>
> > 2. [HADOOP-18134] Run CI for Windows 10
> >    <https://issues.apache.org/jira/browse/HADOOP-18134>
> > We really must set up the CI for Hadoop on Windows to ensure that this
> > never breaks again.
> >
> > *Phase 3 : Resolving systemic issues*
> > 1. [HADOOP-13223] winutils.exe is a bug nexus and should be killed with
> >    an axe.
> >    <https://issues.apache.org/jira/browse/HADOOP-13223>
> > The Hadoop environment is modeled closer to that of Linux than Windows.
> > Thus, we see a lot of functional gaps between running Hadoop on Linux
> > vs. Windows, which have become the source of bugs when it comes to
> > running Hadoop on Windows. One such issue is that of winutils.exe. We
> > can aim to address issues like these in this phase. I plan to provide a
> > JNI implementation for each platform and unify these under a common
> > file system interface, so that we get stack traces for exceptions
> > thrown in these layers and, above all, so that we don't have any
> > disparity between the platforms.
> >
> > *Phase 4 : Produce Windows distribution of Hadoop*
> > 1. [HADOOP-18135] Produce Windows binaries of Hadoop
> >    <https://issues.apache.org/jira/browse/HADOOP-18135>
> > The public should be able to download and install Hadoop on their
> > Windows computers.
> >
> > Thanks,
> > --Gautham
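
For anyone skimming the thread, here is a minimal sketch of the kind of change item 2 of Phase 1 in the quoted mail describes: replacing a POSIX-only call (such as basename() or dirname() from libgen.h) with the C++17 std::filesystem API, which builds with both GCC/Clang and Visual C++. The helper names LastPathComponent, ParentDirectory and CheckPathHelpers are hypothetical and chosen only for illustration; they are not taken from libhdfspp or the XPlatform library, and this is not a claim about how the actual patches are written.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical helper (not part of libhdfspp or XPlatform): returns the last
// component of a path via std::filesystem instead of POSIX basename().
std::string LastPathComponent(const std::string& path) {
  return std::filesystem::path(path).filename().string();
}

// Hypothetical helper: returns everything before the last component,
// via std::filesystem instead of POSIX dirname().
std::string ParentDirectory(const std::string& path) {
  return std::filesystem::path(path).parent_path().string();
}

// Small self-check that can be called from any main().
inline void CheckPathHelpers() {
  assert(LastPathComponent("data/logs/app.log") == "app.log");
  assert(ParentDirectory("data/logs/app.log") == "data/logs");
}
```

The point of the sketch is only that no platform #ifdef is needed here; where std::filesystem does not cover a call, the quoted mail asks for the code to go into the XPlatform library instead.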