Hi.

In short, and in my opinion, you are re-inventing the wheel.
Numerous open-source programs already exist that analyse web logs and generally produce nice-looking graphics etc. from them. They do the splitting-up work properly, as long as you feed them the correct log format; their documentation explains how to do that.
Look up webalizer, awstats etc.
Also, these programs are open-source, so you can look inside to see how they do things, if you really want to write your own code.


yang Yang wrote:
Hi:
I am trying to develop a web-based tool to track page hit counts, user
session activity, etc. for our own sites.

I have run into some problems:

1) How can I tell whether a request target is a page or a resource?

For example, take the following three log entries (some parts removed):

#1-> [17/Sep/2010:11:38:26 +0800] "POST /test.jsp?name=test HTTP/1.1" 200
"test.jsp"
#2-> [17/Sep/2010:11:40:11 +0800] "POST /example/test.jpg HTTP/1.1" 200
"/example/test.jpg"
#3-> [17/Sep/2010:11:44:26 +0800] "POST /example/testServlet HTTP/1.1" 200
"test.jsp"
The pattern used in the above log is: '%t "%r" %s "%U"'.

Log #1 shows a page request with a parameter; it can be used to calculate
the most frequently visited pages.

Log #2 shows a resource request (an image here); it can be used to
calculate the most frequently visited files.

Log #3 shows a request with no extension (it is a servlet); in fact it is
a page.

That is to say, they are different request types, so how do I distinguish
them in my code?
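
One possible approach, offered only as a rough sketch: classify by the file extension of the URL path (the "%U" value). The class name RequestClassifier, the Kind enum and the extension list are all hypothetical, and the rule that a path with no extension counts as a page is an assumption that will not hold for every site.

```java
import java.util.Locale;
import java.util.Set;

class RequestClassifier {
    enum Kind { PAGE, RESOURCE }

    // Hypothetical list of extensions treated as static resources; adjust per site.
    private static final Set<String> RESOURCE_EXTENSIONS =
            Set.of("jpg", "jpeg", "png", "gif", "css", "js", "ico");

    // Classifies a URL path (no query string) by the extension of its last segment.
    static Kind classify(String urlPath) {
        String lastSegment = urlPath.substring(urlPath.lastIndexOf('/') + 1);
        int dot = lastSegment.lastIndexOf('.');
        if (dot < 0) {
            // Assumption: no extension means a servlet mapping, i.e. a page.
            return Kind.PAGE;
        }
        String ext = lastSegment.substring(dot + 1).toLowerCase(Locale.ROOT);
        return RESOURCE_EXTENSIONS.contains(ext) ? Kind.RESOURCE : Kind.PAGE;
    }

    public static void main(String[] args) {
        System.out.println(classify("test.jsp"));
        System.out.println(classify("/example/test.jpg"));
        System.out.println(classify("/example/testServlet"));
    }
}
```

Anything not in the resource list (for example .jsp) is counted as a page here, so the list has to be maintained by hand.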

2) Log parser.
I can read the log file line by line, but how do I extract the value of
each attribute?
They are all on one line. Should I split them using the String.split()
method? But what if the value itself contains the separator?

For example, if I use split(" ") on log #2, the value "POST
/example/test.jpg HTTP/1.1" will be split as well, and this may be
inefficient, so I wonder if there is a tool that lets me do this easily?
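
A regular expression with one capture group per field avoids the split(" ") problem, because the quoted values are matched as a whole. The following is a minimal sketch for the pattern '%t "%r" %s "%U"' only. The class name LogLineParser is hypothetical, and the sketch assumes each entry is on a single line, without the "#1->" labels used above, and that the quoted values contain no embedded double quotes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogLineParser {
    // One capture group per field of the pattern '%t "%r" %s "%U"':
    // [timestamp] "request line" status "url path"
    private static final Pattern LINE = Pattern.compile(
            "^\\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) \"([^\"]*)\"$");

    // Returns {timestamp, requestLine, status, urlPath},
    // or null if the line does not match the expected layout.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String line = "[17/Sep/2010:11:40:11 +0800] "
                + "\"POST /example/test.jpg HTTP/1.1\" 200 \"/example/test.jpg\"";
        String[] fields = parse(line);
        // The request line stays in one piece despite the spaces inside it.
        System.out.println(fields[1]);
    }
}
```

The regular expression has to be kept in step with the access log pattern: if the pattern changes, the capture groups must change with it.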


