I've got two CGI scripts. One uses the old byte semantics (and techniques from the 5.00? days, including things like *data = parseInput();). The other uses character semantics, Encode, "open ( $pgHandle, '<:encoding(shiftjis)', $source )" and such.
The byte-semantics version is about five times as fast, and I'm wondering whether that difference is likely to be due to Encode and Unicode character semantics. I'm reading about 600KB of Shift-JIS CSV using the above open statement. Then I'm depending on use encoding SRC_ENCODING, STDOUT => HTML_ENCODING to convert output to Shift-JIS on the fly, for compatibility with other pages on the site.

I tried storing the data file as UTF-8, but the page load completely stalls. (I guess I need to insert some debug prints and see what happened.)

Would sure appreciate some pointers.

-- 
Joel Rees, programmer, Systems Group
Altech Corporation (Alpsgiken), Osaka, Japan
http://www.alpsgiken.co.jp
----------------------
"When software is patentable, anything is patentable."
(http://swpat.ffii.org)
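
One way to narrow down whether the decoding layer accounts for the slowdown is to time the read in isolation, outside the CGI scripts. The following is only a rough sketch under assumptions: it builds a synthetic Shift-JIS CSV of about 600KB in a temporary file (the row contents and the time_read helper are made up for illustration, not taken from the real scripts) and reads it once as raw bytes and once through ':encoding(shiftjis)'. It measures input decoding only, so it says nothing about the cost of the output conversion or of character-semantics string operations elsewhere in the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);
use Time::HiRes qw(time);

# Hypothetical sample row: "1,<Tokyo>,<Osaka>,<tesuto>\n" written with
# \x{} escapes so this source file stays pure ASCII.
my $text = "1,\x{6771}\x{4EAC},\x{5927}\x{962A},\x{30C6}\x{30B9}\x{30C8}\n";
my $row  = encode('shiftjis', $text);

# Repeat the row until the file is roughly 600KB, like the real data file.
my $rows = int(600 * 1024 / length $row);
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} $row x $rows;
close $fh or die "close failed: $!";

# Read the file line by line through the given PerlIO layer.
# Returns (elapsed seconds, line count, summed length of all lines).
sub time_read {
    my ($layer) = @_;
    my $start = time();
    open my $in, "<$layer", $path or die "cannot open $path: $!";
    my ($lines, $length) = (0, 0);
    while (my $line = <$in>) {
        $lines++;
        $length += length $line;   # bytes under :raw, characters when decoded
    }
    close $in;
    return (time() - $start, $lines, $length);
}

my ($raw_secs, $raw_lines, $raw_len) = time_read(':raw');
my ($dec_secs, $dec_lines, $dec_len) = time_read(':encoding(shiftjis)');

printf "raw bytes:           %.4fs, %d lines, length %d\n",
    $raw_secs, $raw_lines, $raw_len;
printf ":encoding(shiftjis): %.4fs, %d lines, length %d\n",
    $dec_secs, $dec_lines, $dec_len;
```

If the two timings come out close, the read itself is probably not the bottleneck and the per-line processing or the output conversion would be the next thing to time the same way. Pointing the same helper at the real data file, and at its UTF-8 copy with ':encoding(UTF-8)', might also show whether the stall happens during the read at all.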
