[ https://issues.apache.org/jira/browse/ARROW-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723634#comment-16723634 ]
Wes McKinney commented on ARROW-3957:
-------------------------------------

I did this in https://github.com/apache/arrow/pull/3209

The error message now looks like

{code}
ArrowIOError: HDFS list directory of / failed, errno: 255 (Unknown error 255)
Please check that you are connecting to the correct HDFS RPC port
{code}

> [Python] pyarrow.hdfs.connect fails silently
> --------------------------------------------
>
>                 Key: ARROW-3957
>                 URL: https://issues.apache.org/jira/browse/ARROW-3957
>             Project: Apache Arrow
>          Issue Type: Bug
>       Components: Python
> Affects Versions: 0.11.1
>     Environment: centos 7
>        Reporter: Jim Fulton
>        Assignee: Wes McKinney
>        Priority: Major
>          Labels: hdfs
>         Fix For: 0.12.0
>
> I'm trying to connect to HDFS using libhdfs and Kerberos.
> I have JAVA_HOME and HADOOP_HOME set, and {{pyarrow.hdfs.connect}} sets CLASSPATH correctly.
> My connect call looks like:
> {{import pyarrow.hdfs}}
> {{c = pyarrow.hdfs.connect(host='MYHOST', port=42424,}}
> {{                         user='ME', kerb_ticket="/tmp/krb5cc_498970")}}
> This doesn't error, but the resulting connection can't do anything. Calls either error like this:
> {{ArrowIOError: HDFS list directory failed, errno: 255 (Unknown error 255)}}
> or swallow errors (e.g. {{exists}} returning {{False}}).
> Note that {{connect}} errors if the host is wrong, but doesn't error if the port, user, or kerb_ticket is wrong. I have no idea how to debug this, because there are no useful errors.
> Note that I _can_ connect using the hdfs Python package. (Of course, that doesn't provide the API I need to read Parquet files.)
> Any help would be appreciated greatly.
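Since {{connect}} apparently does not validate the port, user, or kerb_ticket (per the report above), one way to avoid the silent failure is to force an RPC round trip immediately after connecting. A minimal sketch, reusing the placeholder host, port, user, and ticket path from the report and assuming {{ArrowIOError}} is exported at the top level of {{pyarrow}}:

{code}
import pyarrow as pa
import pyarrow.hdfs

# Placeholder connection parameters copied from the report above.
fs = pyarrow.hdfs.connect(host='MYHOST', port=42424,
                          user='ME', kerb_ticket='/tmp/krb5cc_498970')

try:
    # connect() returning does not prove the connection is usable, so issue a
    # cheap metadata call right away; a bad port/user/ticket should fail here.
    fs.ls('/')
except pa.ArrowIOError as exc:
    # On 0.12.0+ (after the PR linked above) this message also suggests
    # checking the HDFS RPC port.
    raise SystemExit('HDFS connection is not usable: {}'.format(exc))
{code}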