I'm setting up an export control scan to comply with federal export control
laws. My peers at other business units are all using (well, attempting to
use) AltaVista, but I have chosen ht://Dig (and a good dose of Perl ;-D),
and I have gotten *much* further *much* faster with it.
Anyway, I have been able to parse MS Word and MS Excel files with ht://Dig,
but I am also required to look at other MS Office documents (e.g.
PowerPoint). Does anyone have an external parser I could use? My peers keep
telling me that AltaVista has all of these "filters" (aka parsers), but I
haven't seen or used them...
I am also having difficulty with htmerge on a fairly large (and it will only
grow larger) index. The specific error seems to be coming from the sort
command. When using the standard sort included with Solaris 2.5.1 I get:

    # htmerge -c conf/unix.conf -v -s
    htmerge: Sorting...
    sort: can't create /home/atlantis8/bigler/stmAAAa00598/a: Not a directory
    htmerge: Word sort failed

and when using the GNU sort included with textutils-1.22 I get:

    # htmerge -c conf/unix.conf -v -s
    htmerge: Sorting...
    /home/atlantis3/bigler/opt/bin/sort: read error: Invalid argument
    htmerge: Word sort failed
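
The Solaris sort message appears to point at a scratch-file path, so one
thing worth ruling out is the temporary directory. The sketch below is only
a diagnostic, not a fix: check_sort_tmpdir is a hypothetical helper name, and
it assumes sort writes its temporary files under $TMPDIR (falling back to
/tmp), which has not been confirmed for the way htmerge invokes sort.

```shell
#!/bin/sh
# Hypothetical helper: check that a directory can serve as scratch space
# for sort. Assumption: sort puts its temporary files under $TMPDIR, or
# under /tmp when TMPDIR is unset.
check_sort_tmpdir() {
    # Use the first argument if given, else $TMPDIR, else /tmp.
    dir=${1:-${TMPDIR:-/tmp}}

    # The path must exist and be a directory, not a plain file.
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir"
        return 1
    fi

    # The current user must be able to create files in it.
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 1
    fi

    echo "ok: $dir"
}
```

Running check_sort_tmpdir "$TMPDIR" and then df -k "$TMPDIR" before htmerge
would at least show whether the directory exists, is writable, and has free
space for a large sort.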
Any help would be *greatly* appreciated. I would rather not go the other
direction and be forced into AltaVista... ;-D And I'd like to deliver a
solution well ahead of the "other guy". ;-D
Thanks,
Tyson
---
M. Tyson Bigler SEPTCo Computing Solutions Group
Infrastructure Support Bellaire Technology Center
[EMAIL PROTECTED] 3737 Bellaire Blvd., Room 1007B
713-245-7476 Houston, TX 77025
----------------------------------------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the body of the message.