> On 11 Dec 2015, at 05:14, michael_han <michael_...@hotmail.com> wrote:
> 
> Hi Sarala,
> I found the reason: it's because when Spark runs it still needs Hadoop
> support. I think it's a bug in Spark that's still not fixed ;)
> 

It's related to how the Hadoop filesystem APIs are used to access pretty much 
every filesystem out there, including local "file://" filesystems, where they 
provide Unix-like permissions and linking. It's a pain, but not something 
that can be removed.

For Windows, that Hadoop native stuff is critical; on Linux you can get away 
without it when playing with Spark, but in production the Hadoop native 
libraries are critical for performance, especially when working with compressed 
data. And with encryption and erasure coding, that problem isn't going to go 
away.

View it less as a bug and more as "evidence that you can't get away from low-level 
native code, not if you want performance or OS integration".

> After I downloaded winutils.exe and followed the steps in the workaround
> below, it works fine:
> http://qnalist.com/questions/4994960/run-spark-unit-test-on-windows-7
> 
> 

see also: https://wiki.apache.org/hadoop/WindowsProblems
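The workaround linked above mostly comes down to making winutils.exe findable before Spark starts. A minimal sketch in Python, assuming you've unpacked the Hadoop binaries to a hypothetical C:\hadoop (with winutils.exe in its bin\ subdirectory; adjust the path to your own layout):

```python
import os

# Hypothetical install location; winutils.exe must sit in a "bin"
# subdirectory underneath it.
hadoop_home = r"C:\hadoop"

# Spark's Hadoop client looks for %HADOOP_HOME%\bin\winutils.exe on Windows.
os.environ["HADOOP_HOME"] = hadoop_home

# Putting bin\ on PATH as well helps the native DLLs (e.g. hadoop.dll)
# get picked up.
os.environ["PATH"] = (
    os.path.join(hadoop_home, "bin") + os.pathsep + os.environ.get("PATH", "")
)

# With these set in the environment before launch, a local Spark session
# can then be created as usual, e.g.:
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.master("local[*]").getOrCreate()
```

Setting these in the shell (or system environment variables) before launching Spark works just as well; the point is that they must be in place before the JVM initialises the Hadoop classes.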

Now, one thing that could help would be for someone to extend the Spark build 
with a standalone-windows package, including the native libs for the same 
version of Hadoop that Spark was built with. That could be a nice little project 
for someone to work on: something your fellow Windows users will appreciate...


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
