[ https://issues.apache.org/jira/browse/HDFS-11529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sailesh Mukil updated HDFS-11529:
---------------------------------
    Description:
libHDFS uses a table to compare exceptions against and returns a corresponding error code to the application in case of an error.

However, this table is populated manually and is often not updated when new exceptions are added. This causes libHDFS to return EINTERNAL (or Unknown Error(255)) whenever these exceptions are hit.

These are some examples of exceptions that have been observed on an Error(255):

org.apache.hadoop.ipc.StandbyException (Operation category WRITE is not supported in state standby)
java.io.EOFException: Cannot seek after EOF
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

It is of course not possible to have an error code for every type of exception, so one way to address this is to add a call such as hdfsGetLastException() that would return the last exception that a libHDFS thread encountered. This way, an application may choose to call hdfsGetLastException() if it receives EINTERNAL.

We can use thread-local storage to store this information. This also ensures that the current functionality is preserved.

This is a follow-up to HDFS-4997.

  was:
libHDFS uses a table to compare exceptions against and returns a corresponding error code to the application in case of an error.

However, this table is populated manually and is often not updated when new exceptions are added. This causes libHDFS to return EINTERNAL (or Unknown Error(255)) whenever these exceptions are hit.
These are some examples of exceptions that have been observed on an Error(255):

org.apache.hadoop.ipc.StandbyException (Operation category WRITE is not supported in state standby)
java.io.EOFException: Cannot seek after EOF
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

It is of course not possible to have an error code for every type of exception, so one way to address this is to add a call such as hdfsGetLastException() that would return the last exception that a libHDFS thread encountered.

We can use thread-local storage to store this information. This also ensures that the current functionality is preserved.

This is a follow-up to HDFS-4997.


> libHDFS still does not return appropriate error information in many cases
> -------------------------------------------------------------------------
>
>                 Key: HDFS-11529
>                 URL: https://issues.apache.org/jira/browse/HDFS-11529
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: libhdfs
>    Affects Versions: 2.6.0
>            Reporter: Sailesh Mukil
>            Priority: Critical
>              Labels: errorhandling, libhdfs
>
> libHDFS uses a table to compare exceptions against and returns a
> corresponding error code to the application in case of an error.
> However, this table is populated manually and is often not updated
> when new exceptions are added.
> This causes libHDFS to return EINTERNAL (or Unknown Error(255)) whenever
> these exceptions are hit.
> These are some examples of exceptions that have been observed on an
> Error(255):
> org.apache.hadoop.ipc.StandbyException (Operation category WRITE is not
> supported in state standby)
> java.io.EOFException: Cannot seek after EOF
> javax.security.sasl.SaslException: GSS initiate failed [Caused by
> GSSException: No valid credentials provided (Mechanism level: Failed to find
> any Kerberos tgt)]
> It is of course not possible to have an error code for every type of
> exception, so one way to address this is to add a call such as
> hdfsGetLastException() that would return the last exception that a
> libHDFS thread encountered. This way, an application may choose to call
> hdfsGetLastException() if it receives EINTERNAL.
> We can use thread-local storage to store this information. This also
> ensures that the current functionality is preserved.
> This is a follow-up to HDFS-4997.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
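
To make the proposal in the description concrete, here is a minimal C sketch of how a table lookup with an EINTERNAL fallback could be combined with a per-thread record of the last exception. It is only an illustration under stated assumptions, not libHDFS code: the table entries, the buffer size, and the helper record_exception() are hypothetical, and hdfsGetLastException() is the name suggested in the issue rather than an existing call.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* The issue reports EINTERNAL as error 255; defined here so the sketch stands alone. */
#ifndef EINTERNAL
#define EINTERNAL 255
#endif

/* Illustrative table: maps a Java exception class name to an errno value. */
struct ExceptionInfo {
    const char *name;
    int err;
};

static const struct ExceptionInfo kExceptionTable[] = {
    { "java.io.FileNotFoundException", ENOENT },
    { "org.apache.hadoop.security.AccessControlException", EACCES },
};

/* One buffer per thread (C11 thread-local storage); the size is an assumption. */
static _Thread_local char last_exception[512];

/* Hypothetical helper: remember the exception for this thread, then return
 * the mapped error code, or EINTERNAL when the class is not in the table. */
int record_exception(const char *class_name, const char *message)
{
    size_t i;

    snprintf(last_exception, sizeof(last_exception), "%s: %s",
             class_name, message);
    for (i = 0; i < sizeof(kExceptionTable) / sizeof(kExceptionTable[0]); i++) {
        if (strcmp(kExceptionTable[i].name, class_name) == 0) {
            return kExceptionTable[i].err;
        }
    }
    return EINTERNAL;
}

/* Accessor named in the issue: returns the calling thread's last recorded
 * exception, or NULL if this thread has not recorded one. */
const char *hdfsGetLastException(void)
{
    return last_exception[0] != '\0' ? last_exception : NULL;
}
```

With something like this in place, existing callers that only inspect the error code would see no change, while an application that receives EINTERNAL could call hdfsGetLastException() on the same thread to log the underlying exception text.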