Sorry, I forgot to mention that I'm running a 2.x release from a few weeks ago.
On Wed, Jun 12, 2013 at 8:31 AM, Bai Shen <[email protected]> wrote:
> I'm dealing with a lot of file types that I don't want to index. I was
> originally using the regex filter to exclude them but it was getting out of
> hand.
>
> I changed my plugin includes from
>
> urlfilter-regex
>
> to
>
> urlfilter-(regex|suffix)
>
> I've tried both using the default urlfilter-suffix.txt file, adding the
> extensions I don't want, and making my own file that starts with + and
> includes the extensions I do want.
>
> Neither of these approaches seems to work. I continue to get URLs added to
> the database which contain extensions I don't want. Even adding a
> urlfilter.order section to my nutch-site.xml doesn't work.
>
> I don't see any obvious bugs in the code, so I'm a bit stumped. Any
> suggestions for what else to look at?
>
> Thanks.

