Re: Hi

2010-05-06 Thread Harry Nutch
Did you check crawl-urlfilter.txt? All the domain names that you'd like to crawl have to be mentioned there, e.g.
# accept hosts in MY.DOMAIN.NAME
+^http://([a-z0-9]*\.)*mersin\.edu\.tr/
+^http://([a-z0-9]*\.)*tubitak\.gov\.tr/
Also check property db.ignore.external.links in nutch-default.xml. Should be se
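To make the effect of those two accept lines concrete, here is a minimal Python sketch of how such a rule file could be evaluated. It is not Nutch's own code: it assumes rules are tried top to bottom, that the first matching rule decides, and that a URL matching no rule is dropped. The helper names parse_rules and accepts, the closing "-." rule, and the example.com URL are hypothetical, added only for illustration.

```python
import re

# Rules in the style of crawl-urlfilter.txt: '+' accepts, '-' rejects.
# The two accept lines come from the message above; the final "-." line
# (skip everything else) is an assumption added for this sketch.
RULES_TEXT = r"""
# accept hosts in MY.DOMAIN.NAME
+^http://([a-z0-9]*\.)*mersin\.edu\.tr/
+^http://([a-z0-9]*\.)*tubitak\.gov\.tr/
# skip everything else (assumed)
-.
"""


def parse_rules(text):
    """Turn rule lines into (accept, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no rule
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return the verdict of the first rule whose pattern is found in url."""
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False  # no rule matched: treat the URL as rejected


rules = parse_rules(RULES_TEXT)
for url in ("http://www.mersin.edu.tr/",
            "http://www.tubitak.gov.tr/index.html",
            "http://www.example.com/"):
    print(url, accepts(rules, url))
```

Under these assumptions, links to tubitak.gov.tr pass the filter only once the second accept line is present; with just the mersin.edu.tr line they would be rejected like any other outside host.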

Hi

2010-05-06 Thread Zehra Göçer
I have a problem with Nutch. My project is link analysis. I crawled "www.mersin.edu.tr", analysed the linkdb, and saw only links about mersin.edu.tr. But I have to find the other links on the site, for example www.tubitak.gov.tr, and I cannot find them. How can I find these links? Please help me.

Re: Hi, and help with inject scoring...

2010-03-24 Thread Toby Cole
Excellent, I'll have a look at the patch. Thanks, T On 23/03/2010 19:25, Julien Nioche wrote: Hi Toby, Have a look at https://issues.apache.org/jira/browse/NUTCH-655 The patch has been committed to the SVN repository and should allow you to do exactly what you described. HTH Julien

Re: Hi, and help with inject scoring...

2010-03-23 Thread Julien Nioche
Hi Toby, Have a look at https://issues.apache.org/jira/browse/NUTCH-655 The patch has been committed to the SVN repository and should allow you to do exactly what you described. HTH Julien -- DigitalPebble Ltd http://www.digitalpebble.com On 23 March 2010 17:35, Toby Cole wrote: > Hi Nu

Hi, and help with inject scoring...

2010-03-23 Thread Toby Cole
Hi Nutch list, We're using nutch for what basically amounts to an intranet crawl (just a few domains). We have a HUGE inject list as the site contains a lot of Ajax pages. What I'm wondering is… is there a simple way of getting the injected URLs to have a higher default score

Re: hi Kubes:the question about develop environment!

2009-04-23 Thread Susam Pal
On Thu, Apr 23, 2009 at 12:09 PM, askNutch wrote:
> can hadoop run in vmware machine?
I am running a Hadoop cluster where each node is a VMware virtual machine. So, yes, it is possible. As long as you are able to connect to sockets from one virtual machine to another, I don't see why you c

Re: hi Kubes:the question about develop environment!

2009-04-23 Thread Dennis Kubes
askNutch wrote:
> hi kubes: thank you for your answers! i'm sorry that i didn't express my question. i run nutch only on one machine! and ,i cann't debug hadoop in nutch.because the hadoop's exist is lib. how can i debug hadoop source in nutch?
Build hadoop from scrat

Re: hi Kubes:the question about develop environment!

2009-04-22 Thread askNutch
Hi Kubes, thank you for your answers! I'm sorry that I didn't express my question clearly. I run Nutch on only one machine, and I can't debug Hadoop in Nutch, because Hadoop is present only as a lib. How can I debug the Hadoop source in Nutch? And to my surprise, the Tutorial "RunNutchIn

Re: hi Kubes:the question about develop environment!

2009-04-22 Thread Alexander Aristov
ennis Kubes
> Alexander Aristov wrote:
>> Why not to post such mails personally if you address to single person?
>> Want to know other opinions?
> I would :)
> Dennis
>> Best Regards
>> Alexander Aristov

Re: hi Kubes:the question about develop environment!

2009-04-22 Thread Dennis Kubes
Alexander Aristov wrote:
> Why not to post such mails personally if you address to single person?
> Want to know other opinions?
I would :)
Dennis
> Best Regards
> Alexander Aristov
> 2009/4/22 askNutch
>> hi Kubes: You are the expert! Can you tell me What is the develop

Re: hi Kubes:the question about develop environment!

2009-04-22 Thread Dennis Kubes
askNutch wrote:
> hi Kubes: You are the expert! Can you tell me What is the develop environment do you use to develop nutch ?
Linux, Ubuntu (usually the most recent), sun jdk, core2 laptop (although hoping to upgrade to a sagernotebook.com quad core soon

Re: hi Kubes:the question about develop environment!

2009-04-21 Thread Alexander Aristov
Why not send such mails privately if you are addressing a single person? Or do you want to know other opinions?
Best Regards
Alexander Aristov
2009/4/22 askNutch
> hi Kubes:
> You are the expert!
> Can you tell me What is the develop environment do you use to

hi Kubes:the question about develop environment!

2009-04-21 Thread askNutch
Hi Kubes, you are the expert! Can you tell me what development environment you use to develop Nutch, such as the IDE etc.? I want to debug Nutch. Thank you!

Re: Hi What is the use of refine-query-init.jsp,refine-query.jsp

2007-03-12 Thread Enis Soztutar
inalasuresh wrote: Hi , I am uncommented the refine-query.jsp and refine-query-init.jsp in the search.jsp i searched for bikekeyword it given result. Before that i am trying to run the application with comments & witout comments . but that had given the same result. so plz any one

Re: Hi what is the use of subcollections.xml

2007-03-12 Thread Enis Soztutar
inalasuresh wrote:
> Hi , Any one help me. i am new for nutch.. what is the use of subcollections.xml when it is called. plz give the response for my query,... thanx & regards suresh..
Hi, Subcollections is a plugin for indexing the urls matching a regular expression and subcollections

Hi what is the use of subcollections.xml

2007-03-12 Thread inalasuresh
Hi, can anyone help me? I am new to Nutch. What is the use of subcollections.xml, and when is it called? Please respond to my query. Thanks & regards, suresh

Hi What is the use of refine-query-init.jsp,refine-query.jsp

2007-03-12 Thread inalasuresh
Hi, I uncommented refine-query.jsp and refine-query-init.jsp in search.jsp and searched for the keyword "bike"; it gave results. Before that, I tried running the application both with and without the comments, but that gave the same result. So please, can anyone suggest to me what is

Re: Hi...How to set Nutch-0.8.1 to save logs into log files when running the crawl job?

2006-12-21 Thread Sean Dean
verbosely.
fetcher.verbose false
If true, fetcher will log more verbosely.
- Original Message
From: kevin <[EMAIL PROTECTED]>
To: nutch-user@lucene.apache.org
Sent: Thursday, December 21, 2006 10:55:38 PM
Subject: Hi...How to set Nutch-0.8.1 to save logs into log files when running the

Hi...How to set Nutch-0.8.1 to save logs into log files when running the crawl job?

2006-12-21 Thread kevin
Hi, how do I set Nutch-0.8.1 to save logs into log files when running the crawl job? Is it a setting in nutch-site.xml, or in another configuration file? Thanks for your help in advance! -- kevin

hi all

2006-11-03 Thread kauu
Hi, I have a problem now. I want to crawl the pages whose URLs contain "...item_detail", but I must start crawling from www..com, and if I set rules in "crawl-urlfilter.txt", I can't get the pages I want at all. So what do I need to do now? should

RE: hi all

2006-04-02 Thread Dan Morrill
Andrzej, Cheers! Good to know. Thanks! r/d
-Original Message-
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
Sent: Sunday, April 02, 2006 5:01 PM
To: nutch-user@lucene.apache.org
Subject: Re: hi all
Dan Morrill wrote:
> Since you are using Luke to see the index, luke may not have

Re: hi all

2006-04-02 Thread Andrzej Bialecki
> there is nothing that states they support any character set.
> When you run your search, do you see good characters, or do you see gork?
> Luke may not be able to understand the ISO character sets. (Hypothesis).
Hi, (I'm the guy behind Luke) Luke uses UTF-8, because that's what Luc

Re: hi all

2006-04-02 Thread kauu
states
> they support any character set.
>
> When you run your search, do you see good characters, or do you see gork?
> Luke may not be able to understand the ISO character sets. (Hypothesis).
>
> r/d
>
> -Original Message-
> From: kauu [mailto:[EMAIL PROTECTED

RE: hi all

2006-04-02 Thread Dan Morrill
Subject: Re: hi all
thx for advice! now i know what's up. but my OS is WinXp(CHINESE), it supports Chinese very well. and i used the LUKE to see the index, ant there are messy character when crawl the Chinese webs. so ,how can i deal with it?? any reply will be appreciated.
On 4/2/06

Re: hi all

2006-04-02 Thread kauu
AIL PROTECTED]
> Sent: Sunday, April 02, 2006 7:48 AM
> To: nutch-user@lucene.apache.org
> Subject: hi all
>
> hi all:
> i get a big problem when crawl the ftp.
> it seems that Nutch couldn't parse or index the files named in
> Chinese
> so after the command

RE: hi all

2006-04-02 Thread Dan Morrill
pack installed to properly support Chinese. Personally, I would download the language pack for your operating system and see what happens. r/d
-Original Message-
From: kauu [mailto:[EMAIL PROTECTED]
Sent: Sunday, April 02, 2006 7:48 AM
To: nutch-user@lucene.apache.org
Subject: hi all
hi

hi all

2006-04-02 Thread kauu
Hi all, I have a big problem when crawling FTP. It seems that Nutch can't parse or index files named in Chinese. So after the command, which looks like:
bin/nutch crawl urls.txt -dir test.dir
(I've modified crawl-urlfilter.txt)
# skip file:, ftp:, & mailto: urls
#-^(f

RE: Hi how can I do a incremental crawling

2005-12-05 Thread Goldschmidt, Dave
Hi Kumar, I'm not a Nutch expert, but I think you'd need to re-crawl all URLs to determine if they changed since the last crawl, yes? Depending on what you're doing, you might re-crawl URLs that are most frequently accessed by users, or keep track per crawl how often pages change
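The idea of keeping track, per crawl, of how often pages change could be sketched roughly as below. This is a standalone Python illustration, not a Nutch feature or API: the ChangeTracker class, its method names, and the URLs are all hypothetical, and a content hash is assumed to be a good enough signal that a page changed.

```python
import hashlib


class ChangeTracker:
    """Remembers a content digest per URL and counts observed changes."""

    def __init__(self):
        # url -> {"digest": str, "fetches": int, "changes": int}
        self.records = {}

    def record_fetch(self, url, content):
        """Store the result of one fetch; return True if the page is new or changed."""
        digest = hashlib.sha256(content).hexdigest()
        rec = self.records.get(url)
        if rec is None:
            self.records[url] = {"digest": digest, "fetches": 1, "changes": 0}
            return True
        rec["fetches"] += 1
        changed = digest != rec["digest"]
        if changed:
            rec["changes"] += 1
            rec["digest"] = digest
        return changed

    def change_rate(self, url):
        """Fraction of re-fetches on which the content differed."""
        rec = self.records[url]
        comparisons = rec["fetches"] - 1  # the first fetch has nothing to compare to
        return rec["changes"] / comparisons if comparisons else 0.0

    def by_priority(self):
        """URLs ordered from most to least frequently changing."""
        return sorted(self.records, key=self.change_rate, reverse=True)


tracker = ChangeTracker()
for body in (b"v1", b"v2", b"v3"):
    tracker.record_fetch("http://example.com/news", body)
for body in (b"same", b"same", b"same"):
    tracker.record_fetch("http://example.com/about", body)
print(tracker.by_priority())
```

Every URL still has to be fetched at least once per round to compute its digest, which matches the point made above; the ordering only helps decide which pages to revisit more often.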

Hi how can I do a incremental crawling

2005-12-04 Thread Kumar Limbu
Hi, I am trying to create a site which will crawl a handful of sites. I am using whole-web crawling to crawl these sites. The problem is that I don't know how to do incremental crawling, i.e. fetch and update only the web pages which have changed since they were last crawled. Thank you all. --