[ https://issues.apache.org/jira/browse/ARROW-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519140#comment-17519140 ]
Rok Mihevc commented on ARROW-16147:
------------------------------------

[~emkornfield]

> [C++] ParquetFileWriter doesn't call sink_.Close when using
> GcsRandomAccessFile
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-16147
>                 URL: https://issues.apache.org/jira/browse/ARROW-16147
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Rok Mihevc
>            Priority: Major
>              Labels: GCP
>
> On parquet::arrow::FileWriter::Close the underlying sink is not closed. The
> implementation goes to FileSerializer::Close:
> {code:cpp}
>   void Close() override {
>     if (is_open_) {
>       // If any functions here raise an exception, we set is_open_ to be false
>       // so that this does not get called again (possibly causing segfault)
>       is_open_ = false;
>       if (row_group_writer_) {
>         num_rows_ += row_group_writer_->num_rows();
>         row_group_writer_->Close();
>       }
>       row_group_writer_.reset();
>       // Write magic bytes and metadata
>       auto file_encryption_properties =
>           properties_->file_encryption_properties();
>       if (file_encryption_properties == nullptr) {  // Non encrypted file.
>         file_metadata_ = metadata_->Finish();
>         WriteFileMetaData(*file_metadata_, sink_.get());
>       } else {  // Encrypted file
>         CloseEncryptedFile(file_encryption_properties);
>       }
>     }
>   }
> {code}
> It doesn't call sink_->Close(), which leads to resource leaks and bugs.
> With files (they call their own close() in the destructor) it works fine, but
> it doesn't work with fs::GcsRandomAccessFile. When I call
> parquet::arrow::FileWriter::Close, the data is not flushed to storage until
> the sink stream is closed manually (or stack space change).
> Is this intentional, or is it a bug?

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
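
To make the reported behaviour concrete, here is a minimal, self-contained sketch. It does not use Arrow or Parquet at all: `BufferingSink`, `FooterWriter`, `DemoResult`, and `RunDemo` are hypothetical stand-ins invented for illustration. The only assumption carried over from the report is that the sink commits its bytes to storage on `Close()`, while the writer's `Close()` writes the footer and never closes the sink.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a buffering remote stream: bytes only become
// visible in storage once Close() runs.
class BufferingSink {
 public:
  explicit BufferingSink(std::string* storage) : storage_(storage) {}

  void Write(const std::string& bytes) { pending_ += bytes; }

  // Commits the buffered bytes to storage. Safe to call more than once.
  void Close() {
    if (closed_) return;
    closed_ = true;
    *storage_ += pending_;
    pending_.clear();
  }

  bool closed() const { return closed_; }

 private:
  std::string* storage_;  // not owned
  std::string pending_;
  bool closed_ = false;
};

// Hypothetical stand-in for the file writer. Like the quoted
// FileSerializer::Close, Close() writes the footer but leaves the sink open.
class FooterWriter {
 public:
  explicit FooterWriter(std::shared_ptr<BufferingSink> sink)
      : sink_(std::move(sink)) {}

  void WriteRows(const std::string& rows) { sink_->Write(rows); }

  void Close() {
    if (!is_open_) return;
    is_open_ = false;
    sink_->Write("FOOTER");
    // Note: no sink_->Close() here.
  }

 private:
  std::shared_ptr<BufferingSink> sink_;
  bool is_open_ = true;
};

// Storage contents observed at the two points of interest.
struct DemoResult {
  std::string after_writer_close;
  std::string after_sink_close;
};

DemoResult RunDemo() {
  std::string storage;
  auto sink = std::make_shared<BufferingSink>(&storage);
  FooterWriter writer(sink);

  writer.WriteRows("rows");
  writer.Close();
  assert(!sink->closed());  // the writer did not close its sink

  DemoResult result;
  result.after_writer_close = storage;  // nothing has reached storage yet

  sink->Close();  // workaround: close the sink explicitly
  result.after_sink_close = storage;
  return result;
}
```

In this sketch the storage stays empty after `writer.Close()` and only receives the rows and footer once the sink itself is closed, which mirrors the workaround mentioned in the report: closing the sink stream manually after `parquet::arrow::FileWriter::Close`.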