Re: [C++] How to write parquet file in hive style using parquet::StreamingWriter

2023-03-05 Thread Haocheng Liu
Hi Weston, Thanks again for your suggestion, I was able to come up with a semi-working example with an Acero write node. Now I run into two issues: - The input data has a few columns whose type is *int64_t*. It has triggered an Arrow error: "Invalid: Casting from timestamp[ns] to

Re: [C++] How to write parquet file in hive style using parquet::StreamingWriter

2023-03-02 Thread Haocheng Liu
Hey Weston, Thanks for the suggestion! I will give it a try. Best, Haocheng On Thu, Mar 2, 2023 at 2:29 PM Weston Pace wrote: > If you are working in C++ there are a few interfaces you might be > interested in. > > The simplest high level API for this would be to use Acero and create a >

Re: [C++] How to write parquet file in hive style using parquet::StreamingWriter

2023-03-02 Thread Weston Pace
If you are working in C++ there are a few interfaces you might be interested in. The simplest high level API for this would be to use Acero and create a write node. This is what pyarrow uses (though a little indirectly at the moment). There is a brief example here[1]. I'd be happy to answer

[C++] How to write parquet file in hive style using parquet::StreamingWriter

2023-03-01 Thread Haocheng Liu
Hi Arrow community, Hope this email finds you well. I'm working on a project to convert a bespoke format into parquet format, where each file contains time series data and can be tens of gigabytes in size on a daily basis. I've successfully created a binary with parquet::StreamingWriter to convert
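For context, "hive style" in this thread refers to encoding partition keys as `key=value` directory names, so readers can prune partitions from the path alone. For daily time-series data the layout would look something like the following (illustrative paths, not from the thread):

```
dataset_root/
  date=2023-03-01/part-0.parquet
  date=2023-03-02/part-0.parquet
  date=2023-03-03/part-0.parquet
```

`parquet::StreamWriter` on its own writes a single file and has no notion of partitioning, which is why the replies steer toward the higher-level Datasets/Acero write path.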