[
https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252541#comment-14252541
]
Colin Patrick McCabe edited comment on HDFS-7018 at 12/18/14 11:31 PM:
-----------------------------------------------------------------------
Please don't make another copy of {{hdfs.h}} in the source tree. This will
lead to the various copies getting out of sync over time, which would be very
bad. Instead, just reference the existing copy via a relative path.
You can add your {{hdfsGetLastError}} function to this file, and just have a
dummy implementation that returns "unknown error" in all cases for
{{libwebhdfs}} and {{libhdfs}}. We can improve this in a follow-on JIRA.
{code}
39 #ifdef __cplusplus
40 extern "C" {
41 #endif
{code}
While this is needed in {{hdfs.h}}, it is not needed in your {{Hdfs.cc}} code.
The C\+\+ linker is smart enough to figure out that the prototypes it is seeing
correspond to the prototypes in the {{hdfs.h}} file you included.
{code}
45 static THREAD_LOCAL const char *ErrorMessage = NULL;
46 static THREAD_LOCAL std::string *ErrorMessageBuffer = NULL;
47 static THREAD_LOCAL hdfs::internal::once_flag once;
48
49 static void CreateMessageBuffer() {
50 ErrorMessageBuffer = new std::string;
51 }
{code}
I don't think we need all this. Making the thread-local buffer a pointer to a
{{std::string}} means that we have to check {{once_flag}} before we access it,
which is inefficient. It also means that if the thread exits, this memory will
be leaked (unless you set up a POSIX "thread destructor," which is complicated
and platform-specific).
Instead, let's just have a char\[128\] buffer for each thread. As an added
bonus, because this utilitizes pre-allocation, it handles the case where you
can't allocate memory for the error string itself, which you have said in the
past that you care about.
{code}
158 private:
159 bool input;
160 void *stream;
161 };
{code}
Please don't use {{void*}} here. It is not typesafe.
You can clearly see that FS objects and file objects have a concrete type,
spelled out in {{hdfs.h}}:
{code}
struct hdfs_internal;
typedef struct hdfs_internal* hdfsFS;
struct hdfsFile_internal;
typedef struct hdfsFile_internal* hdfsFile;
{code}
All of these functions need to have a {{catch (...)}} which sets the error
message to "unknown" and returns {{EINTERNAL}}. The reason is that if you
attempt to throw a C\+\+ exception through a C API, the program will abort
(technically, {{std::terminate}} will be called). I realize you probably think
you have caught all possible exceptions, but since this is C\+\+, we can never
really be sure without the {{catch (...)}}
P.S. thanks for working on this!
was (Author: cmccabe):
Please don't make another copy of {{hdfs.h}} in the source tree. This will
lead to the various copies getting out of sync over time, which would be very
bad. Instead, just reference the existing copy via a relative path.
You can add your {{hdfsGetLastError}} function to this file, and just have a
dummy implementation that returns "unknown error" in all cases for
{{libwebhdfs}} and {{libhdfs}}. We can improve this in a follow-on JIRA.
{code}
39 #ifdef __cplusplus
40 extern "C" {
41 #endif
{code}
While this is needed in {{hdfs.h}}, it is not needed in your {{Hdfs.cc}} code.
The C\+\+ linker is smart enough to figure out that the prototypes it is seeing
correspond to the prototypes in the {{hdfs.h}} file you included.
{code}
45 static THREAD_LOCAL const char *ErrorMessage = NULL;
46 static THREAD_LOCAL std::string *ErrorMessageBuffer = NULL;
47 static THREAD_LOCAL hdfs::internal::once_flag once;
48
49 static void CreateMessageBuffer() {
50 ErrorMessageBuffer = new std::string;
51 }
{code}
I don't think we need all this. Making the thread-local buffer a pointer to a
{{std::string}} means that we have to check {{once_flag}} before we access it,
which is inefficient. It also means that if the thread exits, this memory will
be leaked (unless you set up a POSIX "thread destructor," which is complicated
and platform-specific).
Instead, let's just have a char\[128\] buffer for each thread. As an added
bonus, because this utilitizes pre-allocation, it handles the case where you
can't allocate memory for the error string itself, which you have said in the
past that you care about.
{code}
158 private:
159 bool input;
160 void *stream;
161 };
{code}
Please don't use {{void*}} here. It is not typesafe.
You can clearly see that FS objects and file objects have a concrete type,
spelled out in {{hdfs.h}}:
{code}
struct hdfs_internal;
typedef struct hdfs_internal* hdfsFS;
struct hdfsFile_internal;
typedef struct hdfsFile_internal* hdfsFile;
{code}
All of these functions need to have a {{catch (...)}} which sets the error
message to "unknown" and returns {{EINTERNAL}}. The reason is that if you
attempt to throw a C\+\+ exception through a C API, the program will abort
(technically, {{std::terminate}} will be called). I realize you probably think
you have caught all possible exceptions, but since this is C\+\+, we can never
really be sure without the {{catch (...)}}
> Implement C interface for libhdfs3
> ----------------------------------
>
> Key: HDFS-7018
> URL: https://issues.apache.org/jira/browse/HDFS-7018
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: Zhanwei Wang
> Assignee: Zhanwei Wang
> Attachments: HDFS-7018-pnative.002.patch, HDFS-7018.patch
>
>
> Implement C interface for libhdfs3
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)