pchaoda opened a new issue #12743:
URL: https://github.com/apache/arrow/issues/12743


   Hi all,
   I am using pyarrow==7.0.0 to connect to HDFS.
   It runs well on Linux, but unfortunately I get an error on Windows.
   I have set `JAVA_HOME`, `HADOOP_HOME`, and `ARROW_LIBHDFS_DIR`:
   JAVA_HOME=C:\Users\think\Desktop\python-SDK-green\jdk18
   HADOOP_HOME=C:\Users\think\Downloads\hadoop-2.10.1.tar\hadoop-2.10.1\hadoop-2.10.1
   ARROW_LIBHDFS_DIR=C:\Users\think\Desktop\python-SDK-green\hadoop_client\lib\native;C:\Users\think\Desktop\python-SDK-green\jdk18\jre\bin\server;
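   For reference, the same configuration can also be applied from Python before pyarrow is imported. This is only a sketch reproducing the machine-specific paths above; adjust them for your own installation:

   ```python
   import os

   # Machine-specific paths copied from the environment variables above;
   # these are assumptions for any other setup.
   os.environ["JAVA_HOME"] = r"C:\Users\think\Desktop\python-SDK-green\jdk18"
   os.environ["HADOOP_HOME"] = (
       r"C:\Users\think\Downloads\hadoop-2.10.1.tar"
       r"\hadoop-2.10.1\hadoop-2.10.1"
   )
   # Two paths joined with ";", exactly as in the report above.
   os.environ["ARROW_LIBHDFS_DIR"] = (
       r"C:\Users\think\Desktop\python-SDK-green\hadoop_client\lib\native;"
       r"C:\Users\think\Desktop\python-SDK-green\jdk18\jre\bin\server;"
   )

   # import pyarrow               # import only after the variables are set,
   # fs = pyarrow.hdfs.connect()  # so the native code can find libhdfs/jvm.dll
   ```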
   When I call `pyarrow.hdfs.connect()`, I get this error:
   ```
   Traceback (most recent call last):
     File "C:\Users\think\Desktop\python-SDK-green\python\test.py", line 7, in <module>
       data_provider = DataProvider()
     File "C:\Users\think\Desktop\python-SDK-green\python\lib\site-packages\nescqdata\MarketData\dataProvider.py", line 15, in __init__
       super(DataProvider, self).__init__(dfs)
     File "C:\Users\think\Desktop\python-SDK-green\python\lib\site-packages\nescqdata\baseDataProvider.py", line 53, in __init__
       self.dfs = pa.hdfs.connect() if dfs is None else dfs
     File "C:\Users\think\Desktop\python-SDK-green\python\lib\site-packages\pyarrow\hdfs.py", line 227, in connect
       return _connect(
     File "C:\Users\think\Desktop\python-SDK-green\python\lib\site-packages\pyarrow\hdfs.py", line 237, in _connect
       fs = HadoopFileSystem(host=host, port=port, user=user,
     File "C:\Users\think\Desktop\python-SDK-green\python\lib\site-packages\pyarrow\hdfs.py", line 49, in __init__
       self._connect(host, port, user, kerb_ticket, extra_conf)
     File "pyarrow\_hdfsio.pyx", line 85, in pyarrow._hdfsio.HadoopFileSystem._connect
     File "pyarrow\error.pxi", line 114, in pyarrow.lib.check_status
   OSError: Unable to load libjvm: The specified module could not be found.
   ```
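   The "specified module could not be found" message comes from the Windows loader when `jvm.dll` (or one of the DLLs it depends on) cannot be resolved. A small diagnostic sketch; the `jre\bin\server` layout is an assumption based on the `JAVA_HOME` above, and newer JDKs ship `jvm.dll` under `bin\server` instead:

   ```python
   import os

   def jvm_dll_path(java_home):
       # Conventional jvm.dll location for a JDK-with-bundled-JRE layout
       # (an assumption; adjust for your JDK version).
       return os.path.join(java_home, "jre", "bin", "server", "jvm.dll")

   candidate = jvm_dll_path(os.environ.get("JAVA_HOME", r"C:\jdk18"))
   print(candidate, os.path.exists(candidate))
   # If the file exists but loading still fails, a DLL that jvm.dll itself
   # depends on is likely missing from PATH:
   # import ctypes; ctypes.CDLL(candidate)  # reproduces the loader error directly
   ```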
   By the way, before I hit this error, I had modified hdfs.py to pass `shell=True` to `subprocess.check_output` in order to work around another error:
   ```
     File "C:\Users\think\Desktop\python-SDK-green\python\lib\site-packages\pyarrow\hdfs.py", line 145, in _maybe_set_hadoop_classpath
       classpath = _hadoop_classpath_glob(hadoop_bin)
     File "C:\Users\think\Desktop\python-SDK-green\python\lib\site-packages\pyarrow\hdfs.py", line 172, in _hadoop_classpath_glob
       return subprocess.check_output(hadoop_classpath_args)
     File "subprocess.py", line 424, in check_output
       return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
     File "subprocess.py", line 505, in run
       with Popen(*popenargs, **kwargs) as process:
     File "subprocess.py", line 951, in __init__
       self._execute_child(args, executable, preexec_fn, close_fds,
     File "subprocess.py", line 1420, in _execute_child
       hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
   OSError: [WinError 193] %1 is not a valid Win32 application.
   ```
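   For context, `WinError 193` is raised when `CreateProcess` is handed a file that is not a Win32 executable; on Windows the `hadoop` launcher is a script (`hadoop.cmd`), which is why routing the call through the shell works around it. A sketch of the kind of change described above (`check_output_portable` is a hypothetical name, not the actual pyarrow function):

   ```python
   import os
   import subprocess

   def check_output_portable(args):
       # On Windows, CreateProcess cannot launch script files directly and
       # fails with WinError 193; shell=True lets cmd.exe resolve
       # hadoop.cmd / hadoop.bat. On POSIX, keep the safer shell=False.
       return subprocess.check_output(args, shell=(os.name == "nt"))

   # classpath = check_output_portable(["hadoop", "classpath", "--glob"])
   ```

   An alternative that avoids `shell=True` would be to invoke the `.cmd` launcher explicitly, e.g. `%HADOOP_HOME%\bin\hadoop.cmd`.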
   Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
