At 5:45a -0400 on 28 May 2007, Yves Goergen wrote: > On 28.05.2007 09:06 CE(S)T, Kevin Hunter wrote: > >> In particular, I imagine a lot of the HTTP requests would be the >> same, so you could create a table to store the requested URLs, and >> then have a second table with the timestamp and foreign key >> relationship into the first. > > Interesting idea. Inserting would be more work to find the already > present "dictionary" rows.
My guess is not /that/ much work, since you should only have a known and relatively small set in this dictionary, it'd basically be cached the whole time. But, that's my guess. Haven't tried it. Practice and theory . . . > Also, URLs sometimes contain things like > session IDs. They're probably not of interest for my use but it's not > always easy to detect them for removal. Really? Why wouldn't it be easy to detect them? You presumably know what variable you're looking for in the URL string, and applying a simple regex search-and-replace . . . ? > IP addresses (IPv4) and especially return codes (which can be mapped to > a 1-byte value) are probably not worth the reference. Data size values > should be too distributed for this. Well, presumably, you'd normalize that part of the table. That is, rather than include multiple foreign keys in your "data" rows, you'd create a cartesian product of the the return codes with the dictionary table. You'd have a slightly more bloated dictionary, but depending on the number of requests the site(s) get(s), the aggregation would more than make up for it. > I could also parse user agent > strings for easier evaluation, but this takes me the possibility to add > support for newer browsers at a later time. (Well, I could update the > database from the original access log files when I've updated the UA > parser.) > Same thought. If you've only a known set of UA strings, you could normalize them with the dictionary table as well. > How large is a row reference? 4 bytes? I don't know, I'm fairly new to MySQL. I suppose it'd also matter on the type of index. Anyone more knowledgeable wanna pipe up? Well. Whatever method works for your needs, cool. I'm going to check out both MYISAMPACK and ARCHIVE. I was unaware of those. Thanks list! Kevin -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]