[ https://issues.apache.org/jira/browse/PARQUET-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752675#comment-15752675 ]
Wes McKinney commented on PARQUET-799:
--------------------------------------
They're safe to use for *some* production applications, just not concurrent
ones where the same file handle is accessed repeatedly. What I was trying to
say above is that parquet-cpp doesn't feel like the best place to maintain a
general cross-platform, concurrency-safe IO interface -- for single-threaded,
POSIX-like systems, it should be OK.
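
To illustrate the kind of hazard I mean, here is a minimal sketch of how an application could guard a shared handle itself. It assumes the problem is the usual one where two threads interleave a seek and a read on the same handle; LockedFileReader and ReadAt are hypothetical names for illustration, not part of parquet-cpp's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <string>

// Hypothetical wrapper (not part of parquet-cpp): serializes each seek+read
// pair on one shared FILE* so concurrent callers cannot interleave them.
class LockedFileReader {
 public:
  explicit LockedFileReader(std::FILE* file) : file_(file) {}

  // Reads up to nbytes starting at offset and returns the bytes actually
  // read. Returns an empty string if the seek fails.
  std::string ReadAt(long offset, std::size_t nbytes) {
    std::lock_guard<std::mutex> guard(mutex_);  // held for seek AND read
    if (std::fseek(file_, offset, SEEK_SET) != 0) return std::string();
    std::string out(nbytes, '\0');
    std::size_t got = std::fread(&out[0], 1, nbytes, file_);
    out.resize(got);  // short read at end of file
    return out;
  }

 private:
  std::FILE* file_;   // not owned; the caller closes it
  std::mutex mutex_;  // protects the file position of file_
};
```

Every thread would then go through the same LockedFileReader instance instead of touching the FILE* directly. Opening a separate handle per thread avoids the lock entirely, at the cost of more open file descriptors.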
> concurrent usage of the file reader API
> ---------------------------------------
>
> Key: PARQUET-799
> URL: https://issues.apache.org/jira/browse/PARQUET-799
> Project: Parquet
> Issue Type: Bug
> Components: parquet-cpp
> Reporter: William Forson
>
> I've recently been debugging a segfault that occurs when concurrently reading
> (distinct) parquet files from multiple threads.
> I initially assumed this was a reasonable thing to do, since the project
> README doesn't say anything about concurrency one way or the other. But then
> I encountered [this TODO
> comment|https://github.com/apache/parquet-cpp/blob/master/src/parquet/column/page.h#L35]:
> {quote}
> // TODO: Parallel processing is not yet safe because of memory-ownership
> // semantics (the PageReader may or may not own the memory referenced by a
> // page)
> {quote}
> And it has got me wondering: is parquet-cpp fundamentally NOT thread-safe,
> even for the use case of reading a single file per thread at any given time?
> Or is it basically thread-safe with a couple of gotchas?
> Also, just FYI, I'm currently running against a build which incorporates [this
> change|https://github.com/apache/parquet-cpp/commit/002466539f6aba7bf1f885b66f61f302ed88fa6b].
> (aside: my motivation for recently posting an issue re. {{THRIFT_HOME}} was
> to rule out any ABI weirdness that might result from building parquet-cpp
> against a different version of thrift than the applications that ultimately
> consume parquet-cpp)
> Thanks!