I'm guessing you mean write_table? Assuming you are passing a filename / string (and not an open output stream) to write_table, I would expect that any files opened during the call have been closed before the call returns.
Pedantically, this is not quite the same thing as "finished writing on disk"; more accurately, it means "finished writing to the OS". A power outage shortly after a call to write_table completes could still lead to partial loss of the file.

However, this should not matter for your case, if I'm understanding the problem statement in that reddit post correctly. As long as you open the file handle for reading after the call to write_table has returned, you should see all of the contents immediately. There is always the opportunity for bugs, but many of our unit tests write files and then turn around and immediately read them, and we don't typically have trouble here.

I'm assuming your reader & writer are on the same thread & process? If you open a reader while your write task is still running, then no guarantees are made.

On Thu, Jan 6, 2022 at 12:47 PM Brandon Chinn <[email protected]> wrote:
> 
> When `pyarrow.parquet.write_file()` returns, is the parquet file finished
> writing on disk, or is it still writing?
> 
> Context:
> https://www.reddit.com/r/learnpython/comments/rxmq43/help_with_python_file_flakily_not_returning_full/hrj99tq/?context=3
> 
> Thanks!
> Brandon Chinn
