Re: [naviserver-devel] Escaping characters in access log

2015-07-15 Thread David Osborne
Thanks very much, Gustaf. That looks great.

 --
 Don't Limit Your Business. Reach for the Cloud.
 GigeNET's Cloud Solutions provide you with the tools and support that
 you need to offload your IT needs and focus on growing your business.
 Configured For All Businesses. Start Your Cloud Today.
 https://www.gigenetcloud.com/
 ___
 naviserver-devel mailing list
 naviserver-devel@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/naviserver-devel




Re: [naviserver-devel] Escaping characters in access log

2015-07-14 Thread Gustaf Neumann

Dear all,

This is again a very reasonable request. Since most access-log analyzers
are developed against Apache's rules, sticking to those rules seems
sensible ... although missing a few lines of hacking attempts is usually
not an issue.

I've added a small change to the tip version that performs Apache-style
substitutions in the query portion of the access log. The updated version
performs Apache-style escaping for all double-quoted fields that
potentially depend on external input, such as the user agent field or
the referrer field.
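
To illustrate why escaping the double-quoted fields helps analyzers, here is a minimal Python sketch of a tokenizer for combined-format lines that honours backslash escapes inside quoted fields. The regular expression, the function name split_log_line, and the sample line are assumptions made for illustration; they are not taken from NaviServer or from any particular analyzer.

```python
import re

# One field is either [bracketed], "double-quoted" (backslash escapes
# allowed inside), or a bare run of non-whitespace characters.
FIELD = re.compile(r'\[([^\]]*)\]|"((?:[^"\\]|\\.)*)"|(\S+)')


def split_log_line(line: str) -> list:
    """Split a combined-format log line into its fields (hypothetical helper)."""
    fields = []
    for match in FIELD.finditer(line):
        bracketed, quoted, bare = match.groups()
        if bracketed is not None:
            fields.append(bracketed)
        elif quoted is not None:
            # The escaped text is kept as logged; unescaping is a separate step.
            fields.append(quoted)
        else:
            fields.append(bare)
    return fields


if __name__ == "__main__":
    # Hypothetical entry whose request contains an escaped double quote.
    sample = r'9.9.9.9 - - [14/Jul/2015:14:55:34 +0100] "GET /a\"b HTTP/1.0" 404 737 "" "curl/7.26.0"'
    for field in split_log_line(sample):
        print(repr(field))
```

With the quote escaped, the request stays a single field; without the escape, the same tokenizer would end the request field early and misalign every field after it.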

all the best
-g

On 14.07.15 at 09:05, David Osborne wrote:

Hi,

We're running into a problem when parsing data in a NaviServer access
log to analyse server use.

We were relying on the combined log format being parsable, but we run
into difficulties when non-percent-encoded characters make their way
into the logged request.


For example, the URL for testing for an XSS exploit:

/tiki-list_file_gallery.php/"><script>alert(document.domain)</script>

This will be logged to the access log as:

9.9.9.9 - - [14/Jul/2015:14:55:34 +0100] "GET /tiki-list_file_gallery.php/"><script>alert(document.domain)</script> HTTP/1.0" 404 737 "" "curl/7.26.0" 1436882134.386210 0.038129 0.000129 0.16 0.000152


Because of the unescaped quote we can't reliably parse this entry.

I wasn't sure what the server should do in cases like this. The quote
should technically be percent-encoded, but clients like curl allow the
raw character to be sent.


Apache escapes quotes by prefixing a backslash before logging:
http://httpd.apache.org/docs/2.2/mod/mod_log_config.html
Exceptions from this rule are " and \, which are escaped by
prepending a backslash, and all whitespace characters, which are
written in their C-style notation (\n, \t, etc).
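
As a rough illustration of the rule quoted above, the following Python sketch escapes a field before it is written between double quotes. It is based only on the documented behaviour, not on Apache's source: the function name apache_escape, the exact set of C-style escapes, and the use of \xhh for all remaining non-printable bytes are assumptions.

```python
# C-style notations used for whitespace characters (assumed set).
C_STYLE = {"\n": "\\n", "\t": "\\t", "\r": "\\r", "\v": "\\v", "\f": "\\f"}


def apache_escape(value: str) -> str:
    """Escape a log field in the spirit of Apache's documented rule (sketch)."""
    out = []
    for byte in value.encode("utf-8"):
        ch = chr(byte)
        if ch in ('"', "\\"):
            # Quote and backslash are prefixed with a backslash.
            out.append("\\" + ch)
        elif ch in C_STYLE:
            # Whitespace control characters use their C-style notation.
            out.append(C_STYLE[ch])
        elif byte < 0x20 or byte >= 0x7F:
            # Any other non-printable byte becomes a \xhh sequence.
            out.append("\\x%02x" % byte)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(apache_escape('/path/"quoted"\tvalue'))
```

Working on the UTF-8 bytes rather than on characters keeps the output pure ASCII, which is what a line-oriented log parser usually expects.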


Nginx replaces quotes in the log with \x22:
http://trac.nginx.org/nginx/changeset?old_path=%2Fnginx&old=66dc85397a9006d5ecdd74c56d9eac1fd479b5d6&new_path=%2Fnginx&new=66dc85397a9006d5ecdd74c56d9eac1fd479b5d6
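
For comparison, the hex-style alternative can be sketched in Python as below. The sketch assumes that the double quote, the backslash, and every byte below 32 or above 126 are written as \xXX; the function names are hypothetical, and the code is not derived from the nginx changeset itself.

```python
import re


def nginx_style_escape(value: str) -> str:
    """Write '"', '\\' and non-printable bytes as \\xXX (assumed rule)."""
    out = []
    for byte in value.encode("utf-8"):
        if byte < 32 or byte > 126 or byte in (0x22, 0x5C):
            out.append("\\x%02X" % byte)
        else:
            out.append(chr(byte))
    return "".join(out)


def nginx_style_unescape(value: str) -> str:
    """Reverse nginx_style_escape for log analysis."""
    # Every backslash in the escaped text starts a \xXX sequence, so a
    # single left-to-right substitution is enough.
    raw = re.sub(r"\\x([0-9A-Fa-f]{2})",
                 lambda m: chr(int(m.group(1), 16)), value)
    # The substituted characters are raw byte values; reassemble UTF-8.
    return raw.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    print(nginx_style_escape('/path/"quoted"'))
```

Because the backslash itself is hex-encoded, the escaped form never contains a literal quote or a stray backslash, which makes the round trip unambiguous.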

Do we have any means of doing something similar in NaviServer?

--
David

