Re: UTF-16 problem

2007-09-27 Thread Vasja Ocvirk
URLs for next cycle sdfsdf test something.com (http://www.something.com) UTF8 - first two URLs are fetched for next cycle and this is OK. sdfsdf test something.com (http://www.something.com) Thanks! Best regards, Vasja Doğacan Güney wrote: On 9/11/

UTF-16 problem

2007-09-11 Thread Vasja Ocvirk
Does anyone know what to do if Nutch doesn't crawl and index web pages in UTF-16? Has anyone had such a problem yet? Best regards, Vasja

Re: 0.8 much slower than 0.7

2006-08-02 Thread Vasja Ocvirk
ce Sami Siren wrote: > Are you experiencing slowness in general or just on some parts of the process? > Current fetcher is dead slow and it should be given immediate attention. There has been some talk about th

Re: 0.8 much slower than 0.7

2006-08-02 Thread Vasja Ocvirk
me talk about the issue but I haven't seen any code yet. -- Sami Siren Matthew Holt wrote: I agree. Is there any way to disable something to speed it up? I.e., is the MapReduce currently needed if we're not on a DFS? Matt Vasja Ocvirk wrote: Hello, I'm wondering if anyone can help. We in

0.8 much slower than 0.7

2006-07-26 Thread Vasja Ocvirk
Hello, I'm wondering if anyone can help. We injected 1000 seed URLs into Nutch 0.7.2 (basic configuration + 1000 URLs in regexp filter) and it processed them in just a few hours. We just switched to 0.8 with the same configuration and the same URLs, but it seems everything slowed down significantly. Crawl