On 22/09/11 23:23, Javier Amor garcia wrote:
Hello,
I am working on an access.log parser for Squid and I am having trouble with
some URLs that contain non-US characters, like Spanish accents.

To fix the issues with the parser I need to know the following:

Is the character encoding used for the log files always the same, or is it
system dependent?

Neither. It is configuration dependent.

See http://www.squid-cache.org/Doc/config/logformat/

i.e.
  "    output in quoted string format
  [    output in squid text log format as used by log_mime_hdrs
  #    output in URL quoted format
  '    output as-is
  -    left aligned

The default for URI fields should be URL-encoding according to the URI specifications. That means RFC 1738 encoding (percent-escapes) of all non-ASCII characters in the path and query sections, and Punycode for characters in the host authority section (although the Punycode conversion is done by the browser; Squid is agnostic).
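For a parser, that means the URL field has to be decoded in two steps. The following is only a minimal sketch, assuming the URL was logged with the default URL-encoding and that the escaped bytes are UTF-8 (which depends on what the client sent). The function name decode_logged_url is hypothetical, not part of Squid.

```python
from urllib.parse import urlsplit, unquote


def decode_logged_url(raw_url):
    """Return (host, path, query) of a logged URL in readable form.

    Assumption: percent-escapes in path and query carry UTF-8 bytes.
    Escapes that are not valid UTF-8 become U+FFFD instead of raising.
    """
    parts = urlsplit(raw_url)

    # Host: undo Punycode ("xn--..." labels) if present.
    host = parts.hostname or ""
    try:
        host = host.encode("ascii").decode("idna")
    except UnicodeError:
        pass  # leave the host untouched if it is not valid IDNA

    # Path and query: undo RFC 1738 percent-encoding.
    path = unquote(parts.path, encoding="utf-8", errors="replace")
    query = unquote(parts.query, encoding="utf-8", errors="replace")
    return host, path, query
```

If the client sent Latin-1 bytes instead of UTF-8, the escapes will decode to replacement characters, so a real parser may want to retry with another encoding.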

Is there some way to explicitly force Squid to use a given charset (or
UTF-8) in its log files?

All Squid log files are UTF-8. Some specific characters are URL-encoded to enforce one-line log entries; everything else is written as-is.
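On the parser side, one way to act on this is to read each log line as bytes and decode it as UTF-8. The sketch below does that; the Latin-1 fallback for lines that are not valid UTF-8 is purely an assumption on my part (for example, a client that sent raw Latin-1 bytes which were logged as-is), and decode_log_line is a hypothetical helper name.

```python
def decode_log_line(raw):
    """Decode one access.log line given as bytes.

    Tries UTF-8 first. The Latin-1 fallback is an assumption for
    lines that are not valid UTF-8; Latin-1 never fails to decode,
    so the parser keeps going instead of aborting on one bad line.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


def read_log(path):
    """Yield decoded lines from a log file opened in binary mode."""
    with open(path, "rb") as handle:
        for raw in handle:
            yield decode_log_line(raw.rstrip(b"\r\n"))
```

Opening the file in binary mode keeps the decoding decision per line, so a single odd entry does not affect the rest of the file.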

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.15
  Beta testers wanted for 3.2.0.12
