[jira] [Commented] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15742603#comment-15742603 ]

William Forson commented on HAWQ-1210:
--------------------------------------

Hi Zhanwei Wang,

Unfortunately, I don't think I will have the bandwidth to debug this further for at least a few weeks. So far, I've been using libhdfs3 as a black-box component (i.e. I've really only looked at {{hdfs.h}} and build logic), so I will have to get myself up to speed on the basic organization of the codebase, etc.

However, since there is a decent chance I will be using libhdfs3 as a production dependency, in a multi-threaded environment, I would definitely like to understand what is going on here. So I will try to look into this as soon as I have the time.

Thanks!

> Documentation regarding usage of libhdfs3 in concurrent environment
> -------------------------------------------------------------------
>
>                 Key: HAWQ-1210
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1210
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: libhdfs
>            Reporter: William Forson
>            Assignee: Lei Chang
>         Attachments: hdfs_fs_concurrent_test.cpp
>
>
> Hi,
> I've been using libhdfs3 in a single-threaded environment for several months
> now, without any problems. However, as soon as I tried using the library
> concurrently from multiple threads: hello, segfaults.
> Although the source of these segfaults is annoyingly subtle, I've managed to
> isolate it to a relatively small block of my code that does nothing
> interesting aside from using libhdfs3 to download a single hdfs file.
> To be clear: I assume that the mistake here is mine -- that is, that I am
> using your library incorrectly. However, I have been unable to find any
> documentation as to how the libhdfs3 API _should_ be used in a multi-threaded
> environment. I initially interpreted this to mean, "go to town, it's all more
> or less thread-safe", but I am now questioning that interpretation.
> So, I have a question, and a request.
> Question: Are there any known, non-obvious concurrency gotchas regarding the
> usage of libhdfs3 (or whatever it's currently called)?
> Request: Could you please add some documentation, to the README and/or
> hdfs.h, regarding usage in a concurrent environment? (ideally, such notes
> would annotate individual components of the API in hdfs.h, but if the answer
> to my question above is, "No", then this could perhaps be a single sentence
> in the README which affirmatively states that the library is generally safe
> for concurrent usage without additional/explicit synchronization -- anything
> would be better than nothing :))

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15738797#comment-15738797 ]

William Forson edited comment on HAWQ-1210 at 12/11/16 1:26 AM:
----------------------------------------------------------------

Since my own use case involves lots of RAII wrappers and extra context, I've created a standalone executable that uses libhdfs3 to repeatedly download 2 files in separate threads (see attachment). My intent is to use instances of this test program as input to dynamic analysis tools such as valgrind DRD.

This is actually my first time using valgrind's DRD tool, but the output seems suggestive -- namely, even when I run my test program with {{-i 1}} (i.e. downloading each file only once), the DRD output indicates anywhere from 35 to 60 errors (for both {{-shared}} and {{-separate}}, though {{-separate}} generally seems to produce a somewhat larger number of errors).

Of course, the operative word there is "suggestive". As this is C/C++, who knows how many platform-specific variables might be coming into play here. Also, perhaps people who work on this project already have similar/better test programs. At any rate, I'm sharing this just in case it might be useful to anyone on your side.
[jira] [Commented] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15738813#comment-15738813 ]

William Forson commented on HAWQ-1210:
--------------------------------------

Finally: I should note that I have some convincing (ish) evidence that using libhdfs3 concurrently in a multi-threaded environment has caused memory corruption. The manifestation is quite subtle -- so far, it has always been a segfault in unrelated library code. However, I have managed to eliminate these segfaults completely (AFAICT thus far, pending more exhaustive testing) by guarding an {{hdfsRead}} invocation (which appears in only one place in my code) with a mutex.

Additionally, although the introduction of this mutex does not eliminate libhdfs3-related errors from the valgrind DRD output altogether, it does significantly reduce the number of errors reported (e.g. by a factor of 5-10).
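The workaround described in this comment, serializing every read behind one mutex, can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the reporter's code: `FakeReader`, `read_mutex`, `guarded_read`, and `run_guarded_reads` are hypothetical names, and `FakeReader::read` merely stands in for a library call such as {{hdfsRead}} whose thread safety is unknown.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a library read call whose thread safety is
// unknown (the role hdfsRead plays in the comment). It updates shared
// state without any synchronization of its own.
struct FakeReader {
    std::int64_t bytes_served = 0;  // shared, unsynchronized state

    int read(char* buffer, int length) {
        assert(length >= 0);
        for (int i = 0; i < length; ++i) buffer[i] = 'x';
        bytes_served += length;
        return length;
    }
};

// One process-wide mutex that serializes every call into the reader.
inline std::mutex& read_mutex() {
    static std::mutex m;
    return m;
}

// The only place the read call is issued, so the guard cannot be bypassed.
int guarded_read(FakeReader& reader, char* buffer, int length) {
    std::lock_guard<std::mutex> lock(read_mutex());
    return reader.read(buffer, length);
}

// Starts `threads` workers, each issuing `reads_per_thread` guarded reads
// of `chunk` bytes, and returns the total number of bytes the reader served.
std::int64_t run_guarded_reads(int threads, int reads_per_thread, int chunk) {
    FakeReader reader;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&reader, reads_per_thread, chunk] {
            std::vector<char> buffer(chunk);  // each worker owns its buffer
            for (int i = 0; i < reads_per_thread; ++i) {
                guarded_read(reader, buffer.data(), chunk);
            }
        });
    }
    for (auto& w : workers) w.join();
    return reader.bytes_served;
}
```

The trade-off is that a single global lock removes all read parallelism. It is a diagnostic aid or stopgap rather than a confirmed fix, which matches the comment's observation that DRD still reports some errors with the mutex in place.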
[jira] [Comment Edited] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15738797#comment-15738797 ]

William Forson edited comment on HAWQ-1210 at 12/11/16 1:12 AM:
----------------------------------------------------------------

Since my own use case involves lots of RAII wrappers and extra context, I've created a standalone executable that uses libhdfs3 to repeatedly download 2 files in separate threads. My intent is to use instances of this test program as input to dynamic analysis tools such as valgrind DRD.

This is actually my first time using valgrind's DRD tool, but the output seems suggestive -- namely, even when I run my test program with {{-i 1}} (i.e. downloading each file only once), the DRD output indicates anywhere from 35 to 60 errors (for both {{-shared}} and {{-separate}}, though {{-separate}} generally seems to produce a somewhat larger number of errors).

Of course, the operative word there is "suggestive". As this is C/C++, who knows how many platform-specific variables might be coming into play here. Also, perhaps people who work on this project already have similar/better test programs. At any rate, I'm sharing this just in case it might be useful to anyone on your side.
[jira] [Updated] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Forson updated HAWQ-1210:
---------------------------------
    Attachment: hdfs_fs_concurrent_test.cpp

Since my own use case involves lots of RAII wrappers and extra context, I've created a standalone executable that uses libhdfs3 to repeatedly download 2 files in separate threads. My intent is to use instances of this test program as input to dynamic analysis tools such as valgrind DRD.

This is actually my first time using valgrind's DRD tool, but the output seems suggestive -- namely, even when I run my test program with `-i 1` (i.e. downloading each file only once), the DRD output indicates anywhere from 35 to 60 errors (for both `-shared` and `-separate`, though `-separate` generally seems to produce a somewhat larger number of errors).

Of course, the operative word there is "suggestive". As this is C/C++, who knows how many platform-specific variables might be coming into play here. Also, perhaps people who work on this project already have similar/better test programs. At any rate, I'm sharing this just in case it might be useful to anyone on your side.
[jira] [Commented] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736565#comment-15736565 ]

William Forson commented on HAWQ-1210:
--------------------------------------

btw, to make that question _a bit_ more specific, I am particularly interested in knowing how {{hdfsFS}} handles returned by [this|https://github.com/apache/incubator-hawq/blob/master/depends/libhdfs3/src/client/hdfs.h#L151] function should be used. For instance:

a) is {{hdfsFS}} construction expensive or cheap?
b) can a single {{hdfsFS}} handle be safely used for concurrent {{hdfsRead}} operations?
c) can distinct {{hdfsFS}} handles be safely used for concurrent {{hdfsRead}} operations (i.e. if each handle is only being used for a single read operation at any given time)?
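The two usage patterns behind questions (b) and (c) can be sketched as follows. Everything here is an assumption made for illustration: `Handle`, `open_handle`, `read_file`, `Mode`, and `download_all` are hypothetical stand-ins, not part of the libhdfs3 API, and the sketch does not answer whether either pattern is actually safe with real {{hdfsFS}} handles.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a filesystem handle (the role hdfsFS plays).
// The atomic counter only records how many reads went through it.
struct Handle {
    std::atomic<int> reads{0};
};

// Hypothetical stand-in for creating a handle.
std::shared_ptr<Handle> open_handle() { return std::make_shared<Handle>(); }

// Hypothetical stand-in for downloading one file through a handle.
void read_file(Handle& handle, const std::string& path) {
    (void)path;  // a real download would open and read `path` here
    handle.reads.fetch_add(1);
}

enum class Mode {
    Shared,   // question (b): one handle used by all threads
    Separate  // question (c): one handle per thread
};

struct Result {
    std::size_t handles;  // number of distinct handles created
    int reads;            // total reads across all handles
};

// Downloads every path on its own thread, using either one shared handle
// or a separate handle per thread, depending on `mode`.
Result download_all(const std::vector<std::string>& paths, Mode mode) {
    std::vector<std::shared_ptr<Handle>> handles;
    std::shared_ptr<Handle> shared;
    if (mode == Mode::Shared) {
        shared = open_handle();
        handles.push_back(shared);
    }

    std::vector<std::thread> workers;
    for (const auto& path : paths) {
        std::shared_ptr<Handle> handle = shared ? shared : open_handle();
        if (!shared) handles.push_back(handle);
        workers.emplace_back([handle, path] { read_file(*handle, path); });
    }
    for (auto& w : workers) w.join();

    int total = 0;
    for (const auto& h : handles) total += h->reads.load();
    return Result{handles.size(), total};
}
```

If handle construction turns out to be expensive (question (a)), the shared pattern is the attractive one, which is why its safety under concurrent reads matters most.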
[jira] [Updated] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Forson updated HAWQ-1210:
---------------------------------
    Description:
Hi,

I've been using libhdfs3 in a single-threaded environment for several months now, without any problems. However, as soon as I tried using the library concurrently from multiple threads: hello, segfaults.

Although the source of these segfaults is annoyingly subtle, I've managed to isolate it to a relatively small block of my code that does nothing interesting aside from using libhdfs3 to download a single hdfs file.

To be clear: I assume that the mistake here is mine -- that is, that I am using your library incorrectly. However, I have been unable to find any documentation as to how the libhdfs3 API _should_ be used in a multi-threaded environment. I initially interpreted this to mean, "go to town, it's all more or less thread-safe", but I am now questioning that interpretation.

So, I have a question, and a request.

Question: Are there any known, non-obvious concurrency gotchas regarding the usage of libhdfs3 (or whatever it's currently called)?

Request: Could you please add some documentation, to the README and/or hdfs.h, regarding usage in a concurrent environment? (ideally, such notes would annotate individual components of the API in hdfs.h, but if the answer to my question above is, "No", then this could perhaps be a single sentence in the README which affirmatively states that the library is generally safe for concurrent usage without additional/explicit synchronization -- anything would be better than nothing :))

  was:
Hi,

I've been using libhdfs3 in a single-threaded environment for several months now, without any problems. However, as soon as I tried using the library concurrently from multiple threads: hello, segfaults.

Although the source of these segfaults is annoyingly subtle, I've managed to isolate it to a relatively small block of my code that does nothing interesting aside from using libhdfs3 to download a single hdfs file.

To be clear: I assume that the mistake here is mine -- that is, that I am using your library incorrectly. However, I have been unable to find any documentation as to how the libhdfs3 API _should_ be used in a multi-threaded environment. I initially interpreted this to mean, "go to town, it's all more or less threadsafe", but I am now questioning that interpretation.

So, I have a question, a request.

Question: Are there any known, non-obvious concurrency gotchas regarding the usage of libhdfs3 (or whatever it's currently called)?

Request: Could you please add some documentation, to the README and/or hdfs.h, regarding usage in a concurrent environment? (ideally, such notes would annotate individual components of the API in hdfs.h, but if the answer to my question above is, "No", then this could perhaps be a single sentence in the README which affirmatively states that the library is generally safe for concurrent usage without additional/explicit synchronization -- anything would be better than nothing :))
[jira] [Created] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
William Forson created HAWQ-1210:
------------------------------------

             Summary: Documentation regarding usage of libhdfs3 in concurrent environment
                 Key: HAWQ-1210
                 URL: https://issues.apache.org/jira/browse/HAWQ-1210
             Project: Apache HAWQ
          Issue Type: Bug
          Components: libhdfs
            Reporter: William Forson
            Assignee: Lei Chang

Hi,

I've been using libhdfs3 in a single-threaded environment for several months now, without any problems. However, as soon as I tried using the library concurrently from multiple threads: hello, segfaults.

Although the source of these segfaults is annoyingly subtle, I've managed to isolate it to a relatively small block of my code that does nothing interesting aside from using libhdfs3 to download a single hdfs file.

To be clear: I assume that the mistake here is mine -- that is, that I am using your library incorrectly. However, I have been unable to find any documentation as to how the libhdfs3 API _should_ be used in a multi-threaded environment. I initially interpreted this to mean, "go to town, it's all more or less threadsafe", but I am now questioning that interpretation.

So, I have a question, a request.

Question: Are there any known, non-obvious concurrency gotchas regarding the usage of libhdfs3 (or whatever it's currently called)?

Request: Could you please add some documentation, to the README and/or hdfs.h, regarding usage in a concurrent environment? (ideally, such notes would annotate individual components of the API in hdfs.h, but if the answer to my question above is, "No", then this could perhaps be a single sentence in the README which affirmatively states that the library is generally safe for concurrent usage without additional/explicit synchronization -- anything would be better than nothing :))