Hi Arrow community,

I hope this email finds you well. I'm working on a project to convert
a bespoke format into Parquet, where each file contains time series
data and a single day's file can be tens of gigabytes.

I've successfully created a binary that uses parquet::StreamWriter to
convert the file into one big Parquet file.
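
For context, here is roughly what that binary does today, with the
schema simplified to a single timestamp column and a single value
column (the real schema is wider):

#include <arrow/io/file.h>
#include <parquet/exception.h>
#include <parquet/stream_writer.h>

#include <chrono>
#include <memory>

int main() {
  // Simplified schema: one timestamp column, one value column.
  parquet::schema::NodeVector fields;
  fields.push_back(parquet::schema::PrimitiveNode::Make(
      "timestamp", parquet::Repetition::REQUIRED, parquet::Type::INT64,
      parquet::ConvertedType::TIMESTAMP_MICROS));
  fields.push_back(parquet::schema::PrimitiveNode::Make(
      "value", parquet::Repetition::REQUIRED, parquet::Type::DOUBLE,
      parquet::ConvertedType::NONE));
  auto schema = std::static_pointer_cast<parquet::schema::GroupNode>(
      parquet::schema::GroupNode::Make("schema",
                                       parquet::Repetition::REQUIRED, fields));

  std::shared_ptr<arrow::io::FileOutputStream> outfile;
  PARQUET_ASSIGN_OR_THROW(outfile,
                          arrow::io::FileOutputStream::Open("day.parquet"));

  parquet::StreamWriter os{parquet::ParquetFileWriter::Open(outfile, schema)};

  // The real binary reads rows from the bespoke format here.
  for (int64_t i = 0; i < 1000; ++i) {
    os << std::chrono::microseconds{i * 1000000} << static_cast<double>(i)
       << parquet::EndRow;
  }
  return 0;
}
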
Next I would like to 1) break it into smaller files - say one hour of
data per file - and 2) store them in a hive-style directory layout, in
*C++*. I could not find anything about this in the official docs
<https://arrow.apache.org/docs/cpp/tutorials/datasets_tutorial.html>.
Could someone point me to the relevant documentation, or confirm
whether this is doable in C++ today?
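
To make the question concrete, below is a sketch of what I was hoping
to write, pieced together from arrow/dataset/api.h. The
WritePartitioned helper is my own invention, and it assumes I first
derive date and hour columns from the timestamps, so I may well have
the API usage wrong - which is partly why I'm asking:

#include <arrow/api.h>
#include <arrow/dataset/api.h>
#include <arrow/filesystem/api.h>

#include <memory>
#include <string>

// Hypothetical helper: `table` is one day of data and already carries
// date/hour columns (e.g. date="2023-05-01", hour=13) computed from
// the timestamps, so the partitioning can pick them up.
arrow::Status WritePartitioned(const std::shared_ptr<arrow::Table>& table,
                               const std::string& base_dir) {
  // Wrap the in-memory table so it can be scanned like any dataset.
  auto dataset = std::make_shared<arrow::dataset::InMemoryDataset>(table);
  ARROW_ASSIGN_OR_RAISE(auto scanner_builder, dataset->NewScan());
  ARROW_ASSIGN_OR_RAISE(auto scanner, scanner_builder->Finish());

  auto format = std::make_shared<arrow::dataset::ParquetFileFormat>();

  arrow::dataset::FileSystemDatasetWriteOptions write_options;
  write_options.file_write_options = format->DefaultWriteOptions();
  write_options.filesystem = std::make_shared<arrow::fs::LocalFileSystem>();
  write_options.base_dir = base_dir;
  // Hive-style layout: base_dir/date=2023-05-01/hour=13/part-0.parquet
  write_options.partitioning =
      std::make_shared<arrow::dataset::HivePartitioning>(
          arrow::schema({arrow::field("date", arrow::utf8()),
                         arrow::field("hour", arrow::int32())}));
  write_options.basename_template = "part-{i}.parquet";

  return arrow::dataset::FileSystemDataset::Write(write_options, scanner);
}

If FileSystemDataset::Write with a HivePartitioning is indeed the
intended path here, a pointer to a C++ example of it would be much
appreciated.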

Best regards
Haocheng Liu

