westonpace commented on PR #14199:
URL: https://github.com/apache/arrow/pull/14199#issuecomment-1256634727

   > Is the key change here adding an explicit dependency from directory 
creation to the file queue creation?
   
   There was always a dependency but the old algorithm was roughly...
   
   ```
   void StartFile(filename) {
     if (!directory_created) {
       directory_created = true;
       CreateDirectory();
     }
     DoStartFile();
   }
   ```
   
   ...and the new algorithm is now...
   
   ```
   void StartFile(filename) {
     if (!directory_created) {
       if (!creating_directory) {
         creating_directory = true;
         CreateDirectory();
         directory_created = true;
         directory_cv.notify_all();
       } else {
         directory_cv.wait([&] {return directory_created;});
       }
     }
     DoStartFile();
   }
   ```
   
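   The problem was that many file queues can be started for the same 
directory (e.g. when the max rows per file is very small, as it is in this unit 
test), and all of them need to block on directory creation, not just the first 
one. In the old algorithm only the first caller waited for directory creation; 
later callers could proceed to `DoStartFile()` before the directory existed.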
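
For illustration, the "new algorithm" above can be sketched as standalone C++ using `std::mutex` and `std::condition_variable`. This is a minimal sketch, not the code from the PR: the `DirectoryState` class, the `StartFilesConcurrently` helper, the counters, and the sleep that simulates a slow directory creation are all hypothetical names and choices made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-directory state of a dataset writer.
class DirectoryState {
 public:
  // May be called concurrently, once per file queue.
  void StartFile() {
    {
      std::unique_lock<std::mutex> lock(mutex_);
      if (!directory_created_) {
        if (!creating_directory_) {
          // First caller: create the directory, then wake all waiters.
          creating_directory_ = true;
          lock.unlock();  // do not hold the lock during the slow call
          CreateDirectory();
          lock.lock();
          directory_created_ = true;
          directory_cv_.notify_all();
        } else {
          // Later callers: block until the first caller has finished.
          directory_cv_.wait(lock, [&] { return directory_created_; });
        }
      }
    }
    DoStartFile();
  }

  int create_calls() const { return create_calls_.load(); }
  int files_started() const { return files_started_.load(); }

 private:
  // Simulated slow directory creation (assumption for the example).
  void CreateDirectory() {
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    directory_exists_.store(true);
    create_calls_.fetch_add(1);
  }

  // Every file start requires the directory to exist already.
  void DoStartFile() {
    assert(directory_exists_.load());
    files_started_.fetch_add(1);
  }

  std::mutex mutex_;
  std::condition_variable directory_cv_;
  bool directory_created_ = false;   // guarded by mutex_
  bool creating_directory_ = false;  // guarded by mutex_
  std::atomic<bool> directory_exists_{false};
  std::atomic<int> create_calls_{0};
  std::atomic<int> files_started_{0};
};

// Starts `num_files` file queues at once, one thread each.
void StartFilesConcurrently(DirectoryState& state, int num_files) {
  std::vector<std::thread> threads;
  for (int i = 0; i < num_files; ++i) {
    threads.emplace_back([&state] { state.StartFile(); });
  }
  for (auto& t : threads) t.join();
}
```

Calling `StartFilesConcurrently(state, 8)` on a fresh `DirectoryState` should create the directory exactly once and start all eight files only after it exists. The sketch releases the lock around `CreateDirectory()` so that waiters are parked on the condition variable rather than on the mutex; that is a design choice of the example, not something the comment specifies.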


