[jira] [Commented] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15742603#comment-15742603 ]

William Forson commented on HAWQ-1210:
--------------------------------------

Hi Zhanwei Wang,

Unfortunately, I don't think I will have the bandwidth to debug this further for at least a few weeks. So far, I've been using libhdfs3 as a black-box component (i.e. I've really only looked at {{hdfs.h}} and build logic), so I will have to get myself up to speed on the basic organization of the codebase, etc.

However, since there is a decent chance I will be using libhdfs3 as a production dependency, in a multi-threaded environment, I would definitely like to understand what is going on here. So I will try to look into this as soon as I have the time.

Thanks!

> Documentation regarding usage of libhdfs3 in concurrent environment
> -------------------------------------------------------------------
>
>                 Key: HAWQ-1210
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1210
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: libhdfs
>            Reporter: William Forson
>            Assignee: Lei Chang
>         Attachments: hdfs_fs_concurrent_test.cpp
>
>
> Hi,
> I've been using libhdfs3 in a single-threaded environment for several months
> now, without any problems. However, as soon as I tried using the library
> concurrently from multiple threads: hello, segfaults.
> Although the source of these segfaults is annoyingly subtle, I've managed to
> isolate it to a relatively small block of my code that does nothing
> interesting aside from using libhdfs3 to download a single hdfs file.
> To be clear: I assume that the mistake here is mine -- that is, that I am
> using your library incorrectly. However, I have been unable to find any
> documentation as to how the libhdfs3 API _should_ be used in a multi-threaded
> environment. I initially interpreted this to mean, "go to town, it's all more
> or less thread-safe", but I am now questioning that interpretation.
> So, I have a question, and a request.
> Question: Are there any known, non-obvious concurrency gotchas regarding the
> usage of libhdfs3 (or whatever it's currently called)?
> Request: Could you please add some documentation, to the README and/or
> hdfs.h, regarding usage in a concurrent environment? (ideally, such notes
> would annotate individual components of the API in hdfs.h, but if the answer
> to my question above is, "No", then this could perhaps be a single sentence
> in the README which affirmatively states that the library is generally safe
> for concurrent usage without additional/explicit synchronization -- anything
> would be better than nothing :))

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15738862#comment-15738862 ]

Zhanwei Wang commented on HAWQ-1210:
------------------------------------

Hi [~wdf]

Thanks for your report.

By design, sharing a FileSystem between threads should be safe, but sharing an Input/OutputStream is not. If you find anything that contradicts this design, it should be considered a bug.

There is a concurrent read test case in the current code, {{test/function/TestInputStream.cpp}}, but I think it is far from sufficient. Contributions are welcome. You may want to take a look at the current concurrent test case to find out what we have missed.
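The design rule Zhanwei Wang states in this message (share the file system handle, give each thread its own stream) can be sketched as below. This is only an illustration of the pattern, not libhdfs3 code: {{FakeFileSystem}}, its nested {{Stream}} class, and {{download_concurrently}} are hypothetical stand-ins invented for this sketch so that it compiles without the library. In real code the shared object would be the {{hdfsFS}} handle and each thread would open and close its own file handle.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a file system handle (NOT the libhdfs3 API).
// It is immutable after construction, so concurrent open() calls do not race.
class FakeFileSystem {
public:
    explicit FakeFileSystem(std::string contents) : contents_(std::move(contents)) {}

    // Hypothetical stand-in for an input stream. It owns a private read
    // cursor, which is why a single instance must not be shared by threads.
    class Stream {
    public:
        explicit Stream(const std::string& data) : data_(data), pos_(0) {}

        // Copy up to len bytes into buf, advance the cursor, return bytes read.
        std::size_t read(char* buf, std::size_t len) {
            assert(pos_ <= data_.size());
            std::size_t n = std::min(len, data_.size() - pos_);
            data_.copy(buf, n, pos_);
            pos_ += n;
            return n;
        }

    private:
        const std::string& data_;
        std::size_t pos_;
    };

    Stream open() const { return Stream(contents_); }

private:
    const std::string contents_;
};

// Every worker opens its own stream from the shared handle and reads the
// whole "file" into its own slot of the result vector.
std::vector<std::string> download_concurrently(const FakeFileSystem& fs, int num_threads) {
    std::vector<std::string> results(num_threads);
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([&fs, &results, i] {
            FakeFileSystem::Stream in = fs.open();  // private to this thread
            char buf[7];
            std::size_t n;
            while ((n = in.read(buf, sizeof buf)) > 0) {
                results[i].append(buf, n);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return results;
}
```

Calling {{download_concurrently(fs, 8)}} on a {{FakeFileSystem}} should give eight identical copies of the contents, because no read cursor is ever touched by more than one thread. Whether the real library upholds the same guarantee is exactly what this issue asks to have documented and tested.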
[jira] [Commented] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15738813#comment-15738813 ]

William Forson commented on HAWQ-1210:
--------------------------------------

Finally: I should note that I have some convincing (ish) evidence that using libhdfs3 concurrently in a multi-threaded environment has caused memory corruption. The manifestation is quite subtle -- so far, it has always been a segfault in unrelated library code. However, I have managed to eliminate these segfaults completely (AFAICT thus far, pending more exhaustive testing) by guarding an {{hdfsRead}} invocation (which appears in only one place in my code) with a mutex.

Additionally, although the introduction of this mutex does not eliminate libhdfs3-related errors from the valgrind DRD output altogether, it does significantly reduce the number of errors reported (e.g. by a factor of 5-10).
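The mutex workaround William Forson describes in this message can be sketched as follows. The attached test file is not reproduced here, so everything in this block is an assumption: {{FakeReader}}, {{GuardedReader}}, and {{drain_concurrently}} are hypothetical names, and {{FakeReader::read}} merely stands in for a stateful call such as {{hdfsRead}} that is not safe to invoke concurrently on shared state.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a stateful reader that is NOT thread-safe:
// read() performs an unsynchronized read-modify-write on remaining_.
class FakeReader {
public:
    explicit FakeReader(std::size_t total) : remaining_(total) {}

    // Consume up to len bytes and return how many were consumed.
    std::size_t read(std::size_t len) {
        std::size_t n = len < remaining_ ? len : remaining_;
        remaining_ -= n;
        return n;
    }

private:
    std::size_t remaining_;
};

// Wrapper that funnels every read through one mutex, so at most one thread
// is inside the unsafe call at any time.
class GuardedReader {
public:
    explicit GuardedReader(std::size_t total) : reader_(total) {}

    std::size_t read(std::size_t len) {
        std::lock_guard<std::mutex> lock(mutex_);
        return reader_.read(len);
    }

private:
    std::mutex mutex_;
    FakeReader reader_;
};

// Drain one shared reader from several threads and return the total number
// of bytes consumed across all of them.
std::size_t drain_concurrently(std::size_t total, int num_threads, std::size_t chunk) {
    assert(num_threads > 0);
    GuardedReader reader(total);
    std::vector<std::size_t> per_thread(num_threads, 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([&reader, &per_thread, i, chunk] {
            std::size_t n;
            while ((n = reader.read(chunk)) > 0) {
                per_thread[i] += n;  // each thread writes only its own slot
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    std::size_t sum = 0;
    for (std::size_t n : per_thread) {
        sum += n;
    }
    return sum;
}
```

With the lock in place, the per-thread totals add up to exactly {{total}}. The trade-off is that all reads are serialized, and, as noted in the comment, the mutex reduced but did not eliminate the DRD reports, so this is a mitigation to experiment with rather than a confirmed fix.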
[jira] [Commented] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736565#comment-15736565 ]

William Forson commented on HAWQ-1210:
--------------------------------------

btw, to make that question _a bit_ more specific, I am particularly interested in knowing how {{hdfsFS}} handles returned by [this|https://github.com/apache/incubator-hawq/blob/master/depends/libhdfs3/src/client/hdfs.h#L151] function should be used. For instance:

a) is {{hdfsFS}} construction expensive or cheap?
b) can a single {{hdfsFS}} handle be safely used for concurrent {{hdfsRead}} operations?
c) can distinct {{hdfsFS}} handles be safely used for concurrent {{hdfsRead}} operations (i.e. if each handle is only being used for a single read operation at any given time)?