On Wed, Jun 04, 2014 at 08:47:54PM +0200, Jakub Narębski wrote:
> Michael Wagner wrote:
> > On Tue, May 27, 2014 at 04:22:42PM +0200, Jakub Narębski wrote:
> 
> >> Subject: [PATCH] gitweb: Harden UTF-8 handling in generated links
> >>
> >> esc_html() ensures that its input is properly UTF-8 encoded and marked
> >> as UTF-8 with to_utf8().  Make esc_param() (used for query parameters
> >> in generated URLs), esc_path_info() (for escaping path_info
> >> components) and esc_url() use it too.
> >>
> >> This hardens gitweb against errors in UTF-8 handling; because
> >> to_utf8() is idempotent it won't change correct output.
> [...]
> >>   sub esc_param {
> >>    my $str = shift;
> >>    return undef unless defined $str;
> >> +
> >> +  $str = to_utf8($str);
> >>    $str =~ s/([^A-Za-z0-9\-_.~()\/:@ ]+)/CGI::escape($1)/eg;
> >>    $str =~ s/ /\+/g;
> >> +
> >>    return $str;
> >>   }   
>  
> > While trying to view a "blob_plain" of "Gütekriterien.txt", a 404 error
> > occurred. "git_get_hash_by_path" tries to resolve the hash with the wrong
> > filename (git ls-tree -z HEAD -- Gütekriterien.txt) and fails.
> > 
> > The filename needs the correct encoding. Something like this is probably
> > needed for all filenames and should be done at a prior stage:
> 
> True.
> 
> First, I wonder why the tests I did for this situation didn't
> show any errors even before the "harden href()" patch. What
> is different in your config that you see those errors?
> 

Nothing special. It is reproducible with git 1.9.3 (Fedora 20), git
instaweb (lighttpd) and LANG=de_DE.UTF-8.  
 