Perfect! Thank you for that. I had not found the ParquetFileWriter class. Is there an equivalent feather class?

On Fri, Jul 28, 2023 at 7:59 AM Nic Crane <[email protected]> wrote:

> Hi Richard,
>
> It is possible - I've created an example in this gist showing how to loop
> through a list of files and write to a Parquet file one row at a time:
> https://gist.github.com/thisisnic/5bdb85d2742bc318433f2f14b8bd77cf.
>
> Does this solve your problem?
>
> On Thu, 27 Jul 2023 at 12:22, Richard Beare <[email protected]>
> wrote:
>
>> Hi arrow experts,
>>
>> I have what I think should be a standard problem, but I'm not seeing the
>> correct solution.
>>
>> I have data in a nonstandard form (nifti neuroimaging files) that I can
>> load into R and transform into a single row dataframe (which is 30K
>> columns). In a small example I can load about 80 of these into a single
>> dataframe and save as feather or parquet without problem. I'd like to
>> address the problem where I have thousands.
>>
>> The approach of loading a collection (e.g. 10) into a dataframe and
>> saving with a hive standard name and repeating does work, but doesn't seem
>> like the right way to do it.
>>
>> Is there a way to stream data, one row at a time, into a feather or
>> parquet file?
>> I've attempted to use write_feather with a FileOutputStream sink, but
>> without luck
>>
>
