Marvin Humphrey wrote on 1/26/10 8:03 PM:
perl docmaker.pl \
--utf_factor=0 \
--write_files \
--tmp_dir path/to/my/testdocs/ \
--max_files 33000 \
--max_words 3 \
--tmp_dir_segments 2
I wonder whether this produces the same corpus on my OS X 10.5.8 MBPro as on
your system.
No, definitely different. docmaker.pl builds its random strings from your
system dictionary, so the corpus varies from machine to machine.
No matter what, I see the following output:
mar...@smokey:~/projects/ks/perl $ rm -rf test-ks-utf8/ ; perl -Mblib
karpet_utf8_test.pl testdocs/
Crawled 33000 documents
mar...@smokey:~/projects/ks/perl $
damn.
Before we go further, what kind of system are you having trouble on? Is it a
64-bit box?
Yes, 64-bit. Tested on both RHEL 4 and Mac OS X 10.6.
However, when I try to build the most recent KS trunk on the two Linux boxen
I have (32-bit and 64-bit), I get this:
Initializing Charmonizer/Core/OperatingSystem...
Trying to find a bit-bucket a la /dev/null...
Creating compiler object...
Trying to compile a small test file...
_charm_run.c: In function ‘main’:
_charm_run.c:26: error: expected expression before ‘/’ token
_charm_run.c:26: error: too few arguments to function ‘freopen’
_charm_run.c:27: error: expected expression before ‘/’ token
_charm_run.c:27: error: too few arguments to function ‘freopen’
failed to compile _charm_run helper utility
Failed to write charmony.h at buildlib/KinoSearch/Build.pm line 183.
make: *** [all] Error 25
Could one of the changes you committed in the last 48 hours have caused that?
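
Going only by the messages above, gcc appears to be seeing the bit-bucket path
without quotes around it, something like freopen(/dev/null, ...), which would
explain both "expected expression before ‘/’ token" and "too few arguments to
function ‘freopen’" on the same line. That is a guess from the errors, not
something checked against the Charmonizer source. A minimal sketch of the
quoted form, where redirect_to_bit_bucket and its parameters are hypothetical
names for illustration and not the real _charm_run.c:

```c
#include <stdio.h>

/* Hypothetical helper, NOT the actual _charm_run.c code.
 * Reopens `stream` so that everything written to it is discarded.
 * `bit_bucket` must be an ordinary C string such as "/dev/null";
 * if the path were pasted into the source without quotes, the
 * compiler would stop at the leading '/' as in the log above.
 * Returns 0 on success, -1 if freopen fails. */
int redirect_to_bit_bucket(FILE *stream, const char *bit_bucket) {
    if (freopen(bit_bucket, "w", stream) == NULL) {
        return -1;
    }
    return 0;
}
```

Called as redirect_to_bit_bucket(stdout, "/dev/null") and again for stderr,
this would match the two freopen calls on lines 26 and 27 of the log.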
--
Peter Karman . http://peknet.com/ . [email protected]