Answers below:
> (1) the word: each-file ( path bfs? quot -- )
> Does it handle a file successively by quot without first gathering all
> the file-paths in the path ?
>
It keeps a queue of paths to process. For each directory it pushes all of
that directory's paths onto the queue, then calls the quotation on each
one, and if a path is itself a directory, it recurses into that directory's
contents (depth-first or breadth-first). So in your case, with a couple
million files in a single directory, it would build a queue of a couple
million paths and then process it. If your million files are spread across
a tree of directories, it would be more efficient, because it lists one
directory into the queue at a time and then continues processing.
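
To make the queue behaviour concrete, here is a rough sketch of a
breadth-first walk. It is not the actual each-file implementation: the word
name walk-files is made up for illustration, and it assumes the deques,
dlists, io.directories, io.files.info and io.pathnames vocabularies behave
as I remember them.

```factor
USING: deques dlists io.directories io.files.info io.pathnames
kernel locals sequences ;
IN: scratchpad

! Hypothetical word, for illustration only.
:: walk-files ( path quot -- )
    <dlist> :> queue
    path queue push-front
    [ queue deque-empty? ] [
        ! Take the oldest path off the queue (FIFO, so breadth-first).
        queue pop-back :> p
        p quot call( path -- )
        p link-info directory? [
            ! directory-files returns the whole listing as one sequence,
            ! so every entry is queued before any of them is processed.
            p directory-files
            [ p prepend-path queue push-front ] each
        ] when
    ] until ;
```

Something like "/path/to/directory" [ print ] walk-files (with the io
vocabulary loaded) would print every path. The point is the inner each: a
single directory holding a couple million files puts all of them on the
queue in one go.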
> (2) what is the idiomatic way to get the total file size for all files
> in a folder (and its sub-folders) ?
> Using each-file in (1), I am forced to set up a global 'variable'
> called
> total-size.
>
Does this not work?
0 "/path/to/directory" t [ link-info size>> + ] each-file
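
If you want it as a reusable word, a minimal sketch could look like this.
The name total-size is just taken from your question, and I am assuming
each-file has the ( path bfs? quot -- ) signature quoted above and lets the
quotation carry the running sum on the stack:

```factor
USING: accessors io.directories.search io.files.info kernel math ;
IN: scratchpad

! Sum the sizes reported by link-info for every path under the given one.
! The running total stays on the stack beneath each path.
: total-size ( path -- n )
    0 swap t [ link-info size>> + ] each-file ;
```

Then "/path/to/directory" total-size leaves the sum on the stack, with no
global variable. Since the quotation is also called on directories, their
own entry sizes are included; if you only want regular files, you would
need to filter inside the quotation, for example with a check such as
regular-file? from io.files.info.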
--
>
> If I wanted to process a big file to collect some info, I could write:
>
> "path-to-file" ascii [ V{ } [ quot ] each-line ]
> with-file-reader
>
> The collected info is on the stack after the above finishes.
>
>
>
> To go through a huge directory (folder),
> do you know if the current factor can set up something similar ?
>
I'm not sure what you're asking, unless you're wondering whether a
directory with millions of files is handled efficiently when iterating. The
answer is somewhat, since we currently list all the entries and then
process them.
If you look at how ``(directory-entries)`` is implemented, we could easily
build a word that applies a quotation to each entry in an iterative fashion
without making the sequence of all entries first. If that's important for
your performance...
Thanks,
John.
--
___
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk