I have just implemented flx_cp:
~/felix>tools/flx_cp src/lib/std/posix '(.*)[.]flx' 'xx/yy/\1.bak'
copy src/lib/std/posix/filesystem.flx -> xx/yy/filesystem.bak OK
copy src/lib/std/posix/flx_faio_posix.flx -> xx/yy/flx_faio_posix.bak OK
copy src/lib/std/posix/mmap.flx -> xx/yy/mmap.bak OK
copy src/lib/std/posix/posix_headers.flx -> xx/yy/posix_headers.bak OK
copy src/lib/std/posix/process.flx -> xx/yy/process.bak OK
copy src/lib/std/posix/signal.flx -> xx/yy/signal.bak OK
This is a sane copy command (unlike cp).
flx_cp srcdir srcpat dstpat [--test]
recursively scans srcdir, looking for all files whose names *within* the srcdir
(i.e. not counting the srcdir itself) match the Perl-style regexp srcpat, and
copies each file to dstpat, where \1, \2 etc. are replaced by the match groups
from the src regexp.
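As a rough illustration of the srcpat/dstpat rule (not the actual flx_cp code,
which is written in Felix and uses RE2), here is a minimal Python sketch. The
function name target_for is hypothetical, and the sketch assumes the pattern
must match the whole relative path, which is what the example run suggests:

```python
import re

def target_for(rel_path, srcpat, dstpat):
    """Return the destination for rel_path, or None if srcpat does not match.

    Assumption: the regexp has to match the entire path relative to srcdir.
    Python's re module stands in for RE2 here.
    """
    m = re.fullmatch(srcpat, rel_path)
    if m is None:
        return None
    # expand() substitutes \1, \2, ... with the corresponding match groups.
    return m.expand(dstpat)

print(target_for("filesystem.flx", r"(.*)[.]flx", r"xx/yy/\1.bak"))
# prints: xx/yy/filesystem.bak
```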
The --test option tells flx_cp not to actually do anything, just print what it
will do.
Target directories are auto-created.
The command will never clobber a src file with a destination,
and it will not permit two copies to the same destination.
It does currently overwrite existing target files. Files are created with
read/write/execute permission for the current user, with
the current time/date as creation and modification stamps
(this needs to be thought about).
The implementation uses Google RE2 for the regexps and Judy arrays
for the mapping.
We keep three Judy arrays:
src -> dst: a JudySLArray
dst -> src: a JudySLArray
dirs: a JudySLArray with 0 value
The src->dst collects all the source to destination mappings.
The dst->src inverse is used to check two copies don't go to the same target.
After building these data structures we check that no dst is also a src.
The dirs array lists all the directories which need to exist for the targets
to be copied to; these are then created (again with user access) if they
don't already exist.
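The bookkeeping described above can be sketched in Python, with ordinary dicts
and a set standing in for the Judy arrays. This is only an illustration under
assumptions: plan_copies is a hypothetical name, and paths are compared as
plain strings with no normalisation.

```python
import os

def plan_copies(pairs):
    """Build the copy plan from (src, dst) pairs, or raise ValueError.

    Returns (src_to_dst, dirs) where dirs is the sorted list of target
    directories that must exist before copying.
    """
    src_to_dst = {}   # plays the role of the src -> dst array
    dst_to_src = {}   # inverse map, used to detect duplicate targets
    for src, dst in pairs:
        if dst in dst_to_src and dst_to_src[dst] != src:
            raise ValueError(f"{src} and {dst_to_src[dst]} both map to {dst}")
        src_to_dst[src] = dst
        dst_to_src[dst] = src

    # No destination may also be a source, so no src file gets clobbered.
    for dst in dst_to_src:
        if dst in src_to_dst:
            raise ValueError(f"target {dst} is also a source file")

    # Collect the directories the targets live in (a set is enough here).
    dirs = {os.path.dirname(dst) for dst in dst_to_src}
    dirs.discard("")  # targets in the current directory need no mkdir
    return src_to_dst, sorted(dirs)
```

Each directory in the result could then be created with something like
os.makedirs(d, exist_ok=True) before the files are copied.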
Somewhat unfortunately, we do not have a JudyS1Array: two of the arrays
provide a mapping when we actually only needed a set.
What makes this command "sane" is that it treats all regular files in the
srcdir, recursively, as candidates for matching. Directories are NEVER copied.
At present (I think) symlinks and other stuff are also ignored.
Secondly, the target can be derived from the source. You can't do that with
"cp".
"cp" is so bad, it is worse than DOS copy (which could do wildcard replacement
in the target).
The src directory is required primarily as an optimisation, so the scan doesn't
universally have to scan the whole file system.
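For illustration, a scan that considers only regular files under srcdir might
look like the following Python sketch. The name regular_files is hypothetical,
and skipping symlinks is an assumption based on the "I think" above, not a
statement about what flx_cp actually does:

```python
import os

def regular_files(srcdir):
    """Yield paths, relative to srcdir, of regular files found recursively.

    Directories are never yielded. Symlinks and other non-regular entries
    are skipped (assumed behaviour).
    """
    # os.walk does not follow symlinked directories by default.
    for root, _dirnames, filenames in os.walk(srcdir):
        for name in filenames:
            full = os.path.join(root, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            # Names are reported *within* srcdir, not counting srcdir itself.
            yield os.path.relpath(full, srcdir)
```

The relative names produced this way are what the src regexp would be matched
against.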
The checks ensure the command is idempotent (a second copy will do nothing),
and injective (all targets are distinct), and the range doesn't overlap the
source.
It should be possible to do a nice trick: in --test mode emit shell commands:
mkdir fred
cp a b
cp d e
etc. so you can save the result in a file and execute it later (i.e. use flx_cp
to schedule copying, rather than actually do it). In particular, executing the
output of
`flx_cp srcdir srcpat dstpat --test`
would be similar to running flx_cp without --test, except that it uses "cp" to
do the copying (and thus could preserve file modification times etc., e.g. with
cp -p).
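A minimal Python sketch of this idea, assuming the plan is already available
as a src -> dst mapping. The function name shell_script is hypothetical, and
the choice of "mkdir -p" and "cp -p" is an assumption for illustration, not
something flx_cp emits today:

```python
import os
import shlex

def shell_script(src_to_dst):
    """Return shell commands that would perform the planned copies."""
    lines = []
    # Create each target directory first.
    dirs = {os.path.dirname(dst) for dst in src_to_dst.values()}
    dirs.discard("")
    for d in sorted(dirs):
        lines.append("mkdir -p " + shlex.quote(d))
    # Then one cp per file; shlex.quote protects spaces and other
    # shell metacharacters in file names.
    for src, dst in sorted(src_to_dst.items()):
        lines.append("cp -p " + shlex.quote(src) + " " + shlex.quote(dst))
    return "\n".join(lines)

print(shell_script({"src/mmap.flx": "xx/yy/mmap.bak"}))
```

The output can be saved to a file and run later with sh.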
Because we're using Judy arrays, file names may not contain a 0 byte.
To make all this work I had to get the Garbage Collector to accept custom
pointer scanners since JudyArrays hide pointers away, and we need special
routines to find pointers in Judy Arrays.
--
john skaller
[email protected]
_______________________________________________
Felix-language mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/felix-language