OK, I appended "*" to the end of the pattern to skip those files. It seems the
pages I fetched look much nicer now. :)
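
For anyone reading this later, here is a rough sketch of why the quoted rule below let those files through. It is only an illustration: Nutch applies its filter rules with Java regular expressions, and Python's `re` is used here as a stand-in because these simple patterns behave the same way in both. The suffix list is shortened, the URLs are made up, and `WITH_QUERY` is a hypothetical relaxed rule that assumes the "random digits/letters" arrive as a query string; it is not the exact pattern I ended up with.

```python
import re

# Suffix list shortened from the quoted crawl-urlfilter.txt rule for readability.
# The trailing "$" means the URL must END with one of these suffixes.
SUFFIX_ONLY = re.compile(r"\.(js|JS|gif|GIF|css)$")

# Hypothetical relaxed rule (an assumption, not taken from the thread):
# the same suffixes, optionally followed by a query string such as "?v=3f9a1c".
WITH_QUERY = re.compile(r"\.(js|JS|gif|GIF|css)(\?.*)?$")


def is_skipped(pattern, url):
    """Return True if a '-' rule built from this pattern would reject the URL."""
    return pattern.search(url) is not None


# Made-up URLs for illustration only.
urls = [
    "http://example.com/style.css",
    "http://example.com/style.css?v=3f9a1c",
    "http://example.com/app.js?20090305",
    "http://example.com/index.html",
]

for url in urls:
    # Print whether each rule would skip the URL.
    print(url, is_skipped(SUFFIX_ONLY, url), is_skipped(WITH_QUERY, url))
```

The suffix-only rule rejects only the first URL, while the relaxed one also rejects the two with trailing characters and still lets the HTML page through. If the trailing characters are not a query string, the optional group would need to be adapted.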

2009/3/6 Alexander Aristov <[email protected]>

> 2009/3/5 Yves Yu <[email protected]>
>
> > Yes. I see a lot of CSS, GIF and JS files here, even though I set the
> > following rule in my crawl-urlfilter.txt.
> > So ... I will increase depth to 50 and topN to 1000 and see what happens.
> >
> > Thank you very much.
> >
> > # skip image and other suffixes we can't yet parse
> > -\.(js|JS|gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|sit|eps|wmf|zip|ppt|mpg|xls|gz|rpm|tgz|mov|MOV|exe|jpeg|JPEG|bmp|BMP)$
> >
>
> This affects only suffixes, but in your case the CSS and JS URLs end with
> random digits/letters, so you need to disable those MIME types.
