2015-10-02 20:12 GMT+02:00 HP wei <hpwe...@gmail.com>:
> First,
> in Factor's listener terminal (not in the GUI window, though),
> Jon Harper suggested hitting Control-C and then t to terminate
> long-running code.
> I hit Control-C in case (1) below, and it brings up a low-level debugger
> (what a pleasant surprise).
>
> Let me ask a question first before I write more about investigating the
> issue.
> *** In the low-level debugger, one of the commands is 'data', to dump the
> data heap.
> Is there any way to dump the result to a file??
No. But you can easily log the console output:

    ./factor -run=readline-listener |& tee -i out.log

> Summary of further investigation.
>
> The code
>
>     0 "a_path_to_big_folder" x
>     [ link-info dup symbolic-link? [ drop ] [ size>> + ] if ] each-file

I believe this code is a rough example of how to do it. Counting disk
usage in a real Linux directory tree is much more involved than that:
you need to account for hard links, virtual file systems, volatile
files and much more. Look at all the switches "man du" lists -- it is
complicated.

> (1) When x = t (breadth-first, BFS),
> the memory usage reported by Linux's 'top' shows a steady increase
> from around 190M to as high as 2GB, before either I killed it or it hit
> the missing-file issue.

I don't think you are hitting a missing-file issue. In
/proc/<factor-pid>/fd there is an extra ephemeral file, which shows up
because listing the contents of a directory requires opening a file,
which creates a file descriptor. You can trigger the same problem in
Python with:

    [os.stat('/proc/%d/fd/%s' % (os.getpid(), f))
     for f in os.listdir('/proc/%d/fd' % os.getpid())]

> But the total file size of about 280GB is incorrect. It should be
> around 74GB.

This could be because the sizes of /proc files are counted. The
/proc/kcore file in particular is enormous.

> For the above disk, DFS appears to consume much less memory!
> But the resulting file size is incorrect (280GB instead of 70GB).
> This is presumably due to (NOTE-A), and the code must have scanned
> through those OTHER disks. But then the extra scanning appears to be
> incomplete!

It's hard to say what might be up. But if the disks are mounted under
the directory you supplied to each-file, then the files on those disks
will be counted.

> In closing, the simple code (with DFS)
>
>     0 "a_path_to_big_folder" f
>     [ link-info dup symbolic-link? [ drop ] [ size>> + ] if ] each-file
>
> could NOT achieve the intended action --- to sum up the file sizes for
> the files residing on a disk (as pointed to by a_path_to_big_folder).
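For comparison, the "stay on one filesystem" behaviour of du's -x
switch can be sketched in Python as well; this is a rough illustration
(the function name du_one_fs is made up), not a drop-in replacement
for the Factor code:

```python
import os
import stat

def du_one_fs(root):
    """Sum regular-file sizes under root, staying on one filesystem
    (roughly 'du -x'): skip symlinks, count each hard-linked inode once."""
    root_dev = os.lstat(root).st_dev   # device id of the starting filesystem
    seen = set()                       # (st_dev, st_ino) pairs already counted
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune subdirectories that live on another (mounted) filesystem.
        dirnames[:] = [d for d in dirnames
                       if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev]
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue               # file vanished between listing and stat
            if not stat.S_ISREG(st.st_mode):
                continue               # lstat does not follow symlinks
            key = (st.st_dev, st.st_ino)
            if key not in seen:        # count each hard-linked inode once
                seen.add(key)
                total += st.st_size
    return total
```

Note that os.walk does not follow directory symlinks by default, which
also avoids wandering into other parts of the tree.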
That is not surprising. Here is a better way to do it:

    USING: accessors combinators.short-circuit continuations
    io.directories.search io.files.info io.files.types kernel math
    math.order namespaces sets ;

    ! Filter hardlinks
    SYMBOL: seen-inos

    : regular-file-size ( file-info -- s )
        ! In case it's one of the fake huge /proc files
        [ size>> ] [ size-on-disk>> ] bi min ;

    : count-file-info? ( link -- s )
        {
            [ type>> +regular-file+ = ]
            [ { [ nlink>> 1 = ] [ ino>> seen-inos get ?adjoin ] } 1|| ]
        } 1&& ;

    : file-info-size ( link -- s )
        dup count-file-info? [ regular-file-size ] [ drop 0 ] if ;

    : file-size ( path -- s )
        [ link-info file-info-size ] [ 2drop 0 ] recover ;

    : du-tree ( path -- s )
        HS{ } clone seen-inos set
        0 swap t [ file-size + ] each-file ;

It gives decent disk usage counts for me. It underreports the total in
comparison with "du -s --si" because I excluded directory sizes.

--
mvh/best regards
Björn Lindqvist

------------------------------------------------------------------------------
_______________________________________________
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk