I have just implemented flx_cp:

~/felix>tools/flx_cp src/lib/std/posix '(.*)[.]flx' 'xx/yy/\1.bak' 
copy src/lib/std/posix/filesystem.flx -> xx/yy/filesystem.bak OK
copy src/lib/std/posix/flx_faio_posix.flx -> xx/yy/flx_faio_posix.bak OK
copy src/lib/std/posix/mmap.flx -> xx/yy/mmap.bak OK
copy src/lib/std/posix/posix_headers.flx -> xx/yy/posix_headers.bak OK
copy src/lib/std/posix/process.flx -> xx/yy/process.bak OK
copy src/lib/std/posix/signal.flx -> xx/yy/signal.bak OK

This is a sane copy command (unlike cp).

flx_cp srcdir srcpat dstpat [--test]

It recursively scans srcdir, looking for all files whose names *within* srcdir
(i.e. not counting the srcdir prefix) match the Perl-style regexp srcpat, and
copies each one to dstpat, where \1, \2, etc. are replaced by the match groups
from the src regexp.
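
To illustrate the name mapping, here is a rough sketch in Python (not the
Felix/RE2 code flx_cp actually uses). The function name map_names is made up
for the example, and it assumes the whole relative name has to match srcpat,
which is what the sample run above suggests.

```python
import re

def map_names(names, srcpat, dstpat):
    """Return (relative source name, destination path) pairs.

    names are paths relative to srcdir. Assumption: the entire relative
    name must match srcpat, not just a substring of it.
    """
    regex = re.compile(srcpat)
    pairs = []
    for name in names:
        m = regex.fullmatch(name)
        if m:
            # expand() substitutes \1, \2, ... with the match groups
            pairs.append((name, m.expand(dstpat)))
    return pairs

pairs = map_names(["mmap.flx", "signal.flx", "README"],
                  r"(.*)[.]flx", r"xx/yy/\1.bak")
for src, dst in pairs:
    print(src, "->", dst)
```

Names that do not match (README here) are simply skipped.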

The --test option tells flx_cp not to actually do anything, just print what it
would do.

Target directories are auto-created.

The command will never clobber a src file with a destination,
and it will not permit two copies to the same destination.
It does currently overwrite existing target files. Files are created with
read/write/execute permission for the current user, with
the current time/date for creation and modification stamps
(this needs to be thought about).

The implementation uses Google RE2 for the regexps and Judy arrays
for the mapping.

We keep three Judy arrays:

src -> dst:  a JudySLArray
dst -> src: a JudySLArray
dirs: a JudySLArray with 0 value

The src->dst collects all the source to destination mappings.
The dst->src inverse is used to check two copies don't go to the same target.
After building these data structures we check that no dst is also a src.
The dirs array lists all the directories which need to exist for the targets
to be copied to; these are then created (again with user access) if they
don't already exist.
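
The bookkeeping can be sketched in Python with plain dicts and a set standing
in for the three Judy arrays. This is only an illustration of the checks
described above, not the Felix code; plan_copies is a hypothetical name, and
the sketch assumes src and dst paths are written in the same normalised form
so that they can be compared as strings.

```python
import os

def plan_copies(pairs):
    """Build src->dst, dst->src and dirs tables, rejecting unsafe plans.

    pairs is an iterable of (src path, dst path). Assumption: both are
    normalised the same way, so string equality means "same file".
    """
    src_to_dst = {}
    dst_to_src = {}
    dirs = set()
    for src, dst in pairs:
        # Two different sources must not map to one destination.
        if dst in dst_to_src and dst_to_src[dst] != src:
            raise ValueError(f"two copies to the same destination: {dst}")
        src_to_dst[src] = dst
        dst_to_src[dst] = src
    for dst in dst_to_src:
        # No destination may also be a source.
        if dst in src_to_dst:
            raise ValueError(f"destination would clobber a source: {dst}")
        parent = os.path.dirname(dst)
        if parent:
            dirs.add(parent)  # directory that must exist before copying
    return src_to_dst, dst_to_src, dirs
```

The directories in dirs could then be created with something like
os.makedirs(d, exist_ok=True) before any file is copied.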

Somewhat unfortunately, we do not have a JudyS1Array: two of the arrays 
provide a mapping when we actually only needed a set.

What makes this command "sane" is, firstly, that it treats all regular files
in the srcdir, recursively, as candidates for matching. Directories are NEVER
copied. At present (I think) symlinks and other special files are also ignored.
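
As a rough Python illustration of that scan (again not the Felix code, and
regular_files is a made-up name), the following yields only regular files,
named relative to srcdir, and skips symlinks and other special files:

```python
import os

def regular_files(srcdir):
    """Yield names, relative to srcdir, of regular files found recursively."""
    # os.walk does not descend into symlinked directories by default.
    for root, _dirnames, filenames in os.walk(srcdir):
        for fn in sorted(filenames):
            full = os.path.join(root, fn)
            # isfile() follows symlinks, so exclude links explicitly;
            # FIFOs, sockets and the like fail the isfile() test.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            yield os.path.relpath(full, srcdir)
```

Directories are only ever traversed, never yielded, which matches the rule
that directories themselves are not copied.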

Secondly, the target can be derived from the source. You can't do that with
"cp". "cp" is so bad, it is worse than DOS copy (which could do wildcard
replacement in the target).

The src directory is required primarily as an optimisation, so the scan doesn't
have to cover the whole file system.

The checks ensure the command is idempotent (a second copy will do nothing),
and injective (all targets are distinct), and the range doesn't overlap the 
source.

It should be possible to do a nice trick: in --test mode emit shell commands:

mkdir fred
cp a b 
cp d e 

etc. so you can save the result in a file and execute it later (i.e. use flx_cp
to schedule copying, rather than actually do it). In particular

        `flx_cp srcdir srcpat dstpat --test`

would be similar to flx_cp without the --test, except that it uses "cp" to do
the copying (and could thus preserve file modification times etc., for example
with cp -p).
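
A sketch of what such an emitter might look like, in Python rather than Felix.
The function name shell_script and the choice of "mkdir -p" and "cp -p" are
assumptions made for this example, not something flx_cp does today. Names are
shell-quoted so that spaces in file names survive.

```python
import shlex

def shell_script(src_to_dst, dirs):
    """Return shell commands that would perform the planned copies."""
    # Create target directories first; -p also creates missing parents.
    lines = [f"mkdir -p {shlex.quote(d)}" for d in sorted(dirs)]
    # cp -p asks cp to keep modification times and permissions.
    lines += [f"cp -p {shlex.quote(s)} {shlex.quote(d)}"
              for s, d in sorted(src_to_dst.items())]
    return "\n".join(lines)

print(shell_script({"src/mmap.flx": "xx/yy/mmap.bak"}, {"xx/yy"}))
```

The printed text can be saved to a file and run later with sh.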

Because we're using Judy arrays, file names may not contain a 0 byte.

To make all this work I had to get the Garbage Collector to accept custom
pointer scanners since JudyArrays hide pointers away, and we need special
routines to find pointers in Judy Arrays.

--
john skaller
skal...@users.sourceforge.net





_______________________________________________
Felix-language mailing list
Felix-language@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/felix-language
