[ 
https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971767#comment-14971767
 ] 

Haohui Mai commented on HDFS-8766:
----------------------------------

{code}
+typedef struct hdfsFile_internal {
{code}

This will tie the C++ APIs that are proposed in HDFS-9144 with the C bindings. 
I suggest the following pattern:

{code}
namespace {
class FileHandle { ... };
}
struct hdfsFile_internal {
  FileHandle handle;
  hdfsFile_internal(FileHandle &&handle) : handle(std::move(handle)) {}
}

FileHandle *unwrap(hdfsFile f) {
  return f->handle;
}

hdfsFile wrap(FileHandle &&handle) {
  return new hdfsFile_internal(handle);
}
{code}

{code}
+  std::promise<Status> stat;
+  std::future<Status> future = stat.get_future();
+
+  // wrap async FileSystem::Open with promise to make it a blocking call
+  InputStream *input_stream = nullptr;
+  auto h = [&stat, &input_stream](const Status &s, InputStream *is) {
+    stat.set_value(s);
+    input_stream = is;
+  };
+
+  file_system_->Open(path, h);
+
+  // block until promise is set
+  auto s = future.get();
{code}

There is a race-condition here. You're effectively hitting 
https://github.com/haohui/libhdfspp/issues/31. You'll need to capture the 
promise into the closure of the handler. And please wrap them as a template to 
avoid copying code.


{code}
+  // note: order of members matters a lot here.
+  std::unique_ptr<IoService> service_;
+  // std::thread needs to join before deletion
+  struct WorkerDeleter {
+    void operator()(std::thread *t) {
+      t->join();
+      delete t;
+    }
{code}

Instead of relying the orders to ensure correctness, how about extracting them 
as functions to explicitly enforce the order?

{code}
+hdfsFile HdfsInternal::OpenFileForRead(const std::string &path) {
+bool HdfsInternal::Connect(const char *nn, tPort port, unsigned int threads) {
{code}

It makes more sense to for these methods to return a {{Status}} object (which 
contains the information about {{errno}}). The C API can set the errno based on 
the information. For example, 

{code}
+Status HdfsInternal::OpenFileForRead(const std::string &path, FileHandle 
**handle) {
{code}



> Implement a libhdfs(3) compatible API
> -------------------------------------
>
>                 Key: HDFS-8766
>                 URL: https://issues.apache.org/jira/browse/HDFS-8766
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>         Attachments: HDFS-8766.HDFS-8707.000.patch, 
> HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, 
> HDFS-8766.HDFS-8707.003.patch, HDFS-8766.HDFS-8707.004.patch, 
> HDFS-8766.HDFS-8707.005.patch, HDFS-8766.HDFS-8707.006.patch, 
> HDFS-8766.HDFS-8707.007.patch, HDFS-8766.HDFS-8707.008.patch, 
> HDFS-8766.HDFS-8707.009.patch
>
>
> Add a synchronous API that is compatible with the hdfs.h header used in 
> libhdfs and libhdfs3.  This will make it possible for projects using 
> libhdfs/libhdfs3 to relink against libhdfspp with minimal changes.
> This also provides a pure C interface that can be linked against projects 
> that aren't built in C++11 mode for various reasons but use the same 
> compiler.  It also allows many other programming languages to access 
> libhdfspp through builtin FFI interfaces.
> The libhdfs API is very similar to the posix file API which makes it easier 
> for programs built using posix filesystem calls to be modified to access HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to