Below is a script that parses Apache web.log files in Common Log Format
(CLF) and writes the data to Excel-friendly .csv files.

I do have a question, however: the script takes a LONG time to run because
of all of the DNS lookups. Is there any way to speed this up?

Thanks.

Ryan C. Christiansen
Web Developer
Intellisol International


REBOL []

log-file: read/lines ftp://username:[EMAIL PROTECTED]/logs/web.log

first-line: parse log-file/1 none
first-line/4: remove first-line/4 {[}
first-line/5: remove first-line/5 {]}
checksum-string: rejoin [first-line/4 " " first-line/5]
checksum-date: make date! checksum-string

csv-file-name: make file! (rejoin [checksum-date ".csv"])
write csv-file-name {User IP Address, User Domain Address, Date Hit, Time Hit, File Hit, Bytes Transferred, Referring Page, Browser Type}
write/append csv-file-name (newline newline)

foreach log-line log-file [

    current-line: parse log-line none

    current-line/4: remove current-line/4 {[}
    current-line/5: remove current-line/5 {]}
    date-string: rejoin [current-line/4 " " current-line/5]
    hit-date: make date! date-string

    either not-equal? hit-date checksum-date [
        csv-file-name: make file! (rejoin [hit-date ".csv"])
        write csv-file-name {User IP Address, User Domain Address, Date Hit, Time Hit, File Hit, Bytes Transferred, Referring Page, Browser Type}
        write/append csv-file-name (newline newline)

        current-line: parse log-line none
        IP-address: make tuple! current-line/1
        domain-address: read join dns:// IP-address
        current-line/4: remove current-line/4 {[}
        current-line/5: remove current-line/5 {]}
        date-string: rejoin [current-line/4 " " current-line/5]
        hit-date: make date! date-string
        parse date-string [thru ":" copy text to end (hit-time: make time! text)]
        hit-file: current-line/6
        hit-bytes: current-line/8
        referring-page: make url! current-line/9
        browser-type: current-line/10

        write/append csv-file-name (rejoin [
            IP-address "," domain-address "," hit-date "," hit-time ","
            hit-file "," hit-bytes "," referring-page "," browser-type newline
        ])

        checksum-date: hit-date
    ][
        current-line: parse log-line none
        IP-address: make tuple! current-line/1
        domain-address: read join dns:// IP-address
        current-line/4: remove current-line/4 {[}
        current-line/5: remove current-line/5 {]}
        date-string: rejoin [current-line/4 " " current-line/5]
        hit-date: make date! date-string
        parse date-string [thru ":" copy text to end (hit-time: make time! text)]
        hit-file: current-line/6
        hit-bytes: current-line/8
        referring-page: make url! current-line/9
        browser-type: current-line/10

        write/append csv-file-name (rejoin [
            IP-address "," domain-address "," hit-date "," hit-time ","
            hit-file "," hit-bytes "," referring-page "," browser-type newline
        ])

        print (rejoin [
            IP-address "," domain-address "," hit-date "," hit-time ","
            hit-file "," hit-bytes "," referring-page "," browser-type newline
        ])
    ]
]
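
One possible way to cut down the DNS time, assuming many hits in a log
come from the same addresses, is to remember each answer and only query
DNS the first time an address is seen. The sketch below has not been run
against this log; dns-cache and cached-lookup are hypothetical names, not
part of REBOL, and it assumes a failed reverse lookup yields none.

```rebol
REBOL []

; Hypothetical cache: a flat block of IP / host-name pairs.
dns-cache: copy []

; Hypothetical helper: query DNS only the first time an address is seen.
cached-lookup: func [ip [tuple!] /local host] [
    ; Keys are tuple! and values are string!, so SELECT can only match a key.
    if host: select dns-cache ip [return host]
    ; Assumption: a lookup that finds nothing yields none. Store a
    ; placeholder so the same address is not queried again.
    host: any [read join dns:// ip "unknown"]
    append dns-cache reduce [ip host]
    host
]
```

In the script, the two "domain-address: read join dns:// IP-address" lines
would then become "domain-address: cached-lookup IP-address". How much this
helps depends on how often addresses repeat in the log.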
