Here's a first draft of a new union.scm. On my YeeLoong 8101B, when building my 155-package profile, it uses about 1/14 as much CPU time (25 seconds vs 6 minutes), and about 2/7 as much real time (2 minutes vs about 7.17 minutes). It also handles libffi in core-updates properly. A few notes:
* For now, I'm using a hash table, because it turned out to be rather awkward to use a vhash for this. I can make it purely functional later if it's deemed important. * I've not yet updated tests/union.scm, which tests internal procedures in union.scm that no longer exist. * Although I can now run union-build in 2 minutes, another full minute is spent in 'display-search-paths', plus another 30-40 seconds in 'guix package' before the profile build begins. In the end, the total time for "guix package -i" goes from around 8.5 minutes to around 3.5 minutes: a significant improvement, but there's still more code for me to look at trying to optimize. Instead of a patch, I've simply attached the new union.scm. The only code they have in common is 'file=?', which is unchanged. Comments and suggestions welcome, Mark
;;; GNU Guix --- Functional package management for GNU ;;; Copyright © 2012, 2013, 2014 Ludovic Courtès <l...@gnu.org> ;;; Copyright © 2014 Mark H Weaver <m...@netris.org> ;;; ;;; This file is part of GNU Guix. ;;; ;;; GNU Guix is free software; you can redistribute it and/or modify it ;;; under the terms of the GNU General Public License as published by ;;; the Free Software Foundation; either version 3 of the License, or (at ;;; your option) any later version. ;;; ;;; GNU Guix is distributed in the hope that it will be useful, but ;;; WITHOUT ANY WARRANTY; without even the implied warranty of ;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ;;; GNU General Public License for more details. ;;; ;;; You should have received a copy of the GNU General Public License ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. (define-module (guix build union) #:use-module (ice-9 match) #:use-module (ice-9 format) #:use-module (ice-9 receive) #:use-module (srfi srfi-1) #:use-module (srfi srfi-26) #:use-module (rnrs bytevectors) #:use-module (rnrs io ports) #:export (union-build)) ;;; Commentary: ;;; ;;; Build a directory that is the union of a set of directories, using ;;; symbolic links. ;;; ;;; Code: (define (names-in-directory dirname) (let ((dir (opendir dirname))) (let loop ((filenames '())) (match (readdir dir) ((or "." "..") (loop filenames)) ((? eof-object?) (closedir dir) filenames) (name (loop (cons name filenames))))))) (define (directory? name) (eq? 'directory (stat:type (stat name)))) (define (file=? file1 file2) "Return #t if FILE1 and FILE2 are regular files and their contents are identical, #f otherwise." (let ((st1 (stat file1)) (st2 (stat file2))) (and (eq? (stat:type st1) 'regular) (eq? (stat:type st2) 'regular) (= (stat:size st1) (stat:size st2)) (call-with-input-file file1 (lambda (port1) (call-with-input-file file2 (lambda (port2) (define len 8192) (define buf1 (make-bytevector len)) (define buf2 (make-bytevector len)) (let loop () (let ((n1 (get-bytevector-n! port1 buf1 0 len)) (n2 (get-bytevector-n! port2 buf2 0 len))) (and (equal? n1 n2) (or (eof-object? n1) (loop)))))))))))) (define* (union-build output inputs #:key (log-port (current-error-port))) (define (make-link input output) (format log-port "`~a' ~~> `~a'~%" input output) (symlink input output)) (define (union output inputs) (match inputs ((input) ;; There's only one input, so just make a link. (make-link input output)) (_ (receive (dirs files) (partition directory? inputs) (cond ((null? files) ;; All inputs are directories. Create a new directory ;; where we will merge the input directories. (mkdir output) ;; Build a hash table mapping each name to a list of input ;; directories containing that name. (let ((table (make-hash-table))) (define (add-to-table! name dir) (let ((handle (hash-create-handle! table name '()))) (set-cdr! handle (cons dir (cdr handle))))) ;; Populate the table. (for-each (lambda (dir) (for-each (cut add-to-table! <> dir) (names-in-directory dir))) dirs) ;; Now iterate over the table and recursively ;; perform a union for each entry. (hash-for-each (lambda (name dirs-with-name) (union (string-append output "/" name) (map (cut string-append <> "/" name) (reverse dirs-with-name)))) table))) ((null? dirs) ;; All inputs are non-directories. We assume they are files. (match files ((file (? (cut file=? <> file)) ...) ;; All files have the same contents, so there's no conflict. (make-link file output)) ((file other-files ...) (format (current-error-port) "warning: collision encountered: ~{~a ~}~%" files) ;; TODO: Implement smarter strategies. (format (current-error-port) "warning: arbitrarily choosing ~a~%" file) (make-link file output)))) (else ;; The inputs are a mixture of files and directories (error "union-build: collision between file and directories" `((files ,files) (dirs ,dirs))))))))) (setvbuf (current-output-port) _IOLBF) (setvbuf (current-error-port) _IOLBF) (when (file-port? log-port) (setvbuf log-port _IOLBF)) (union output (delete-duplicates inputs))) ;;; union.scm ends here