[jira] [Created] (ARROW-16075) Does arrow support S3 bucket retention period setting
Sifang Li created ARROW-16075:
---------------------------------

Summary: Does arrow support S3 bucket retention period setting
Key: ARROW-16075
URL: https://issues.apache.org/jira/browse/ARROW-16075
Project: Apache Arrow
Issue Type: Improvement
Reporter: Sifang Li

I cannot find any documentation on how to set the object lock (retention) period when creating a bucket (directory) through Arrow's S3 support. Is there a way to do this within Arrow?

Thanks

--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ARROW-15790) [C++] field's metadata is not written to Parquet file
[ https://issues.apache.org/jira/browse/ARROW-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17499692#comment-17499692 ]

Sifang Li commented on ARROW-15790:
-----------------------------------

Yes, that worked for me. It would be nice if the field metadata were stored automatically, because I cannot imagine it taking up much space, or why anyone would want that information dropped in any scenario.

> [C++] field's metadata is not written to Parquet file
> -----------------------------------------------------
>
> Key: ARROW-15790
> URL: https://issues.apache.org/jira/browse/ARROW-15790
> Project: Apache Arrow
> Issue Type: Bug
> Environment: Ubuntu
> Reporter: Sifang Li
> Priority: Blocker
>
> I used this code to test writing field metadata to a Parquet file and reading it back:
> [https://gist.github.com/dantrim/33f9f14d0b2d3ec45c022aa05f7a45ee]
>
> The generated file has no metadata when I read it back with the code below and print it:
>
> {quote}
> std::shared_ptr<arrow::io::ReadableFile> infile;
> PARQUET_ASSIGN_OR_THROW(infile,
>     arrow::io::ReadableFile::Open("./test.parquet", arrow::default_memory_pool()));
> std::unique_ptr<parquet::arrow::FileReader> reader;
> PARQUET_THROW_NOT_OK(
>     parquet::arrow::OpenFile(infile, arrow::default_memory_pool(), &reader));
> std::shared_ptr<arrow::Table> table;
> PARQUET_THROW_NOT_OK(reader->ReadTable(&table));
> EXPECT_EQ(frameCount, table->num_rows());
> std::cout<<"==="ToString(true) < shown
> {quote}
>
> Here is the version info:
> libparquet-dev/focal,now 7.0.0-1 amd64 [installed]
> libparquet-glib-dev/focal,now 7.0.0-1 amd64 [installed]
> libparquet-glib700/focal,now 7.0.0-1 amd64 [installed,automatic]
> libparquet700/focal,now 7.0.0-1 amd64 [installed,automatic]
[jira] [Commented] (ARROW-15790) field's metadata is not write into Parquet file
[ https://issues.apache.org/jira/browse/ARROW-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17499224#comment-17499224 ]

Sifang Li commented on ARROW-15790:
-----------------------------------

I can see in writer.cc that the metadata is apparently ignored:

Status Init() {
  return SchemaManifest::Make(writer_->schema(), /*schema_metadata=*/nullptr,
                              default_arrow_reader_properties(), &schema_manifest_);
}
[jira] [Created] (ARROW-15790) field's metadata is not write into Parquet file
Sifang Li created ARROW-15790:
---------------------------------

Summary: field's metadata is not write into Parquet file
Key: ARROW-15790
URL: https://issues.apache.org/jira/browse/ARROW-15790
Project: Apache Arrow
Issue Type: Bug
Environment: Ubuntu
Reporter: Sifang Li

I used this code to test writing field metadata to a Parquet file and reading it back:
[https://gist.github.com/dantrim/33f9f14d0b2d3ec45c022aa05f7a45ee]

The generated file has no metadata when I read it back with the code below and print it:

{quote}
std::shared_ptr<arrow::io::ReadableFile> infile;
PARQUET_ASSIGN_OR_THROW(infile,
    arrow::io::ReadableFile::Open("./test.parquet", arrow::default_memory_pool()));
std::unique_ptr<parquet::arrow::FileReader> reader;
PARQUET_THROW_NOT_OK(
    parquet::arrow::OpenFile(infile, arrow::default_memory_pool(), &reader));
std::shared_ptr<arrow::Table> table;
PARQUET_THROW_NOT_OK(reader->ReadTable(&table));
EXPECT_EQ(frameCount, table->num_rows());
std::cout<<"==="ToString(true) <
[jira] [Closed] (ARROW-15780) missing header file parquet/parquet_version.h
[ https://issues.apache.org/jira/browse/ARROW-15780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sifang Li closed ARROW-15780.
-----------------------------
Resolution: Not A Problem

> missing header file parquet/parquet_version.h
> ---------------------------------------------
>
> Key: ARROW-15780
> URL: https://issues.apache.org/jira/browse/ARROW-15780
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Affects Versions: 7.0.0
> Environment: Ubuntu 20.04
> Reporter: Sifang Li
> Priority: Blocker
>
> I am following the instructions for writing a table to a Parquet file:
> [https://arrow.apache.org/docs/cpp/parquet.html]
> This requires #include "parquet/arrow/writer.h"
> Apparently one header file is missing from the source tree; I cannot find it anywhere:
> In file included from ../3rd_party/arrow/cpp/src/parquet/arrow/writer.h:24,
> ...
> ../3rd_party/arrow/cpp/src/parquet/properties.h:31:10: fatal error: parquet/parquet_version.h: No such file or directory
> 31 | #include "parquet/parquet_version.h"
[jira] [Commented] (ARROW-15780) missing header file parquet/parquet_version.h
[ https://issues.apache.org/jira/browse/ARROW-15780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497677#comment-17497677 ]

Sifang Li commented on ARROW-15780:
-----------------------------------

Thanks - I will close this.
[jira] [Commented] (ARROW-15780) missing header file parquet/parquet_version.h
[ https://issues.apache.org/jira/browse/ARROW-15780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497657#comment-17497657 ]

Sifang Li commented on ARROW-15780:
-----------------------------------

I just ran the commands below (from the manual config instructions):

$ mkdir build-release
$ cd build-release
$ cmake ..
$ make -j8 # if you have 8 CPU cores, otherwise adjust
[jira] [Commented] (ARROW-15780) missing header file parquet/parquet_version.h
[ https://issues.apache.org/jira/browse/ARROW-15780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497647#comment-17497647 ]

Sifang Li commented on ARROW-15780:
-----------------------------------

It looks like an installation issue. I followed the manual build instructions at:
[https://github.com/apache/arrow/blob/master/docs/source/developers/cpp/building.rst]

The libraries build fine in the out-of-source build directory, but parquet_version.h is missing. I see it has a .in file, which the build apparently did not convert to a .h file.

My cmake is 3.16.3 - is that why?
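For context on the .in file mentioned above: CMake's configure_file command copies a template to an output path chosen by the project, replacing each @NAME@ placeholder with the value of the CMake variable NAME. The output is usually written to the build tree, which would explain why the header cannot be found under ../3rd_party/arrow/cpp/src; whether that is what happened here is an assumption, since the thread does not say. Below is a minimal sketch of the substitution step in standard C++. The helper names ExpandTemplate and IsIdentifier are hypothetical, the identifier rule is simplified, and this is neither CMake's implementation nor the contents of Arrow's actual template.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: true if `s` is a non-empty run of letters, digits, or '_'.
// (CMake accepts a few more characters in variable names; this is simplified.)
bool IsIdentifier(const std::string& s) {
  if (s.empty()) return false;
  for (unsigned char c : s) {
    if (!(std::isalnum(c) || c == '_')) return false;
  }
  return true;
}

// Hypothetical helper: replace every @NAME@ token in `text` with vars[NAME].
// Names missing from `vars` expand to the empty string. An '@' that does not
// start a valid @NAME@ token is copied through unchanged.
std::string ExpandTemplate(const std::string& text,
                           const std::map<std::string, std::string>& vars) {
  std::string out;
  std::size_t pos = 0;
  while (pos < text.size()) {
    const std::size_t open = text.find('@', pos);
    const std::size_t close =
        (open == std::string::npos) ? std::string::npos : text.find('@', open + 1);
    if (close == std::string::npos) {
      // No complete token left: copy the remainder verbatim.
      out.append(text, pos, std::string::npos);
      break;
    }
    const std::string name = text.substr(open + 1, close - open - 1);
    if (!IsIdentifier(name)) {
      // Not a placeholder: keep the text up to and including this '@',
      // then resume scanning right after it.
      out.append(text, pos, open - pos + 1);
      pos = open + 1;
      continue;
    }
    out.append(text, pos, open - pos);  // literal text before the token
    const auto it = vars.find(name);
    if (it != vars.end()) out += it->second;
    pos = close + 1;
  }
  return out;
}
```

For example, expanding the template line `#define V @V@` with V set to 7 yields `#define V 7`. If a generated header is written to the build tree, pointing the compiler at the build tree's include directory (or at the install prefix after installing) is the usual way to make it visible, which would be consistent with this issue being closed as "Not A Problem".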
[jira] [Created] (ARROW-15780) missing header file parquet/parquet_version.h
Sifang Li created ARROW-15780:
---------------------------------

Summary: missing header file parquet/parquet_version.h
Key: ARROW-15780
URL: https://issues.apache.org/jira/browse/ARROW-15780
Project: Apache Arrow
Issue Type: Bug
Components: C++
Affects Versions: 7.0.0
Environment: Ubuntu 20.04
Reporter: Sifang Li

I am following the instructions for writing a table to a Parquet file:
[https://arrow.apache.org/docs/cpp/parquet.html]
This requires #include "parquet/arrow/writer.h"
Apparently one header file is missing from the source tree; I cannot find it anywhere:
In file included from ../3rd_party/arrow/cpp/src/parquet/arrow/writer.h:24,
...
../3rd_party/arrow/cpp/src/parquet/properties.h:31:10: fatal error: parquet/parquet_version.h: No such file or directory
31 | #include "parquet/parquet_version.h"