Re: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Sudhi Seshachala
[ ] +1 - yes, I vote for the proposal This is awesome --- On Thu, 4/1/10, Andrzej Bialecki wrote: From: Andrzej Bialecki Subject: [VOTE] Nutch to become a top-level project (TLP) To: nutch-user@lucene.apache.org Date: Thursday, April 1, 2010, 12:23 PM Hi all, According to an earlier [DISCUSS

Re: Host specific parsing

2009-07-28 Thread Sudhi Seshachala
It has been a long time since I wrote a plugin. You could simply embed different logic in the same plugin, can't you? sudhi --- On Tue, 7/28/09, Koch Martina wrote: From: Koch Martina Subject: Host specific parsing To: "nutch-user@lucene.apache.org" Date: Tuesday, July 28, 2009, 2:24 AM Hi, has anyone bu

Re: Support needed

2009-07-28 Thread Sudhi Seshachala
As a long-time Nutch user and plugin developer who has even implemented Nutch in some products, I could help you. I am based in Houston, Texas; Skype me at hooduku. sudhi --- On Mon, 7/27/09, sf30098 wrote: From: sf30098 Subject: Support needed To: nutch-user@lucene.apache.org Date: Monday, J

Re: Running Nutch without Tomcat

2008-07-28 Thread Sudhi Seshachala
Michael, Yes, you can run just the crawler part. Lucene provides the API to index the crawled data and search the indexed data. So the short answer is that you can do it. Thanks sudhi --- On Mon, 7/28/08, Michael Chan <[EMAIL PROTECTED]> wrote: From: Michael Chan <[EMAIL PROTECTED]> Subject: Running Nutc

Re: nutch crawl on a site that needs authentication

2006-07-30 Thread Sudhi Seshachala
Check the http plugins. They should have a way. Just search the list; this question has been addressed already. Thanks Sudhi Deepa Devanathan <[EMAIL PROTECTED]> wrote: hi guys, I have a site i need to crawl but the very first page asks for a username, password. Is there a way I can supply these t

Re: any success with php-java-bridge and Nutch?

2006-07-26 Thread Sudhi Seshachala
For http://www.myopensourcejobs.com, we are using something similar to OpensourceXML. That works like a champ. I am not sure whether the PHP-Java bridge would make any difference in terms of performance. Thanks Sudhi Stefan Neufeind <[EMAIL PROTECTED]> wrote: Chris Stephens wrote: > Has anyone had succes

Re: Nutch with nsf files

2006-07-26 Thread Sudhi Seshachala
I do not believe we have an nsf plugin. You will have to write a plugin to be able to handle nsf files. There is a plugin tutorial on the Nutch wiki. Please refer to it. Thanks Sudhi Deepa Devanathan <[EMAIL PROTECTED]> wrote: hi guys, Can Nutch parse thru Lotus notes databases - .nsf files

Help associating domain name and ip address

2006-07-21 Thread Sudhi Seshachala
Hello Nutchians, I am sure many of you have experienced the same problem that I have right now. I have a domain name, http://www.myopensourcejobs.com. I have my app hosted on a server (virtual dedicated server) 68.x.x.x at GoDaddy. I want to configure and associate the IP address and domain n

Re: 0.8 - Will not accept url list file on Windows

2006-07-18 Thread Sudhi Seshachala
Please try this command: bin/nutch crawl search -dir /usr/data/crawl -depth 2 &> crawl.log & where the search folder contains the files that list the URLs. The crawler will crawl data into the /usr/data/crawl/crawldb folder, and crawl.log is the log file. Hope this helps. Thanks S
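The command above can be sketched as a small script. This is only an illustration under stated assumptions: the seed file name urls.txt and the URL http://www.example.com/ are hypothetical, and the script assumes it is run from the Nutch installation directory in a POSIX-style shell (on Windows, something like Cygwin). It writes the portable form `> crawl.log 2>&1` in place of the bash-only `&>` shorthand.

```shell
# Hypothetical seed directory and URL; adjust both to your own setup.
SEED_DIR=search
mkdir -p "$SEED_DIR"

# One URL per line in a plain text file inside the seed directory.
printf '%s\n' 'http://www.example.com/' > "$SEED_DIR/urls.txt"

# Start the crawl in the background only if the Nutch launcher exists here.
# Standard output and standard error both go to crawl.log.
if [ -x bin/nutch ]; then
  bin/nutch crawl "$SEED_DIR" -dir /usr/data/crawl -depth 2 > crawl.log 2>&1 &
fi
```

Note that the first argument is the directory holding the URL list files, not a single file.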

Re: 0.8 Dev Will not accept url list file on Windows

2006-07-18 Thread Sudhi Seshachala
Please try this command: bin/nutch crawl search -dir /usr/data/crawl -depth 2 &> crawl.log & where the search folder contains the files that list the URLs. The crawler will crawl data into the /usr/data/crawl/crawldb folder, and crawl.log is the log file. Hope this helps. Thanks

Re: Could we configure nutch-site.xml with two directories?

2006-07-18 Thread Sudhi Seshachala
Oops, ignore my previous mail. Just check search.jsp; there is a parameter "lang". By default it is set to en. You could change it based on the locale settings. Accordingly, you could manage the search directories too. Please refer to search.jsp and Opensearchservlet. It is pretty straight for

Re: Could we configure nutch-site.xml with two directories?

2006-07-18 Thread Sudhi Seshachala
There are a couple of ways this could be done, according to the mailing lists. One is to give the user the choice of selecting the directory; the other is to deploy two war files, each with a different searcher.dir configured in its corresponding conf folder. Thanks Sudhi nasm <[EMAIL PROTECTED]> wrote: h

Re: Vertical Search (Nutch) for Opensource Jobs- http://www.myopensourcejobs.com

2006-07-18 Thread Sudhi Seshachala
in add same banners ads to rentavilce the server On 7/18/06, Sudhi Seshachala wrote: > Thanks. > I have written Parse plugins which pretty much customize the crawling and parse > according to the rules defined in the Parse plugin. I have a index and Query > plugin specific to the domain I opera

Re: Vertical Search (Nutch) for Opensource Jobs- http://www.myopensourcejobs.com

2006-07-17 Thread Sudhi Seshachala
to run my crawlers). I have two machines running legacy Fedora Core 2. Hope that helps. Thanks Sudhi Nutch Newbie <[EMAIL PROTECTED]> wrote: Good work! On 7/17/06, Sudhi Seshachala wrote: In addition for crawling, I have customized the process of crawling. > Just curiou

Vertical Search (Nutch) for Opensource Jobs- http://www.myopensourcejobs.com

2006-07-17 Thread Sudhi Seshachala
Hello Nutchians, Please visit the site http://www.myopensourcejobs.com. The site is built using LAMP and Nutch. I use the Nutch crawler to crawl jobs from commercial sites such as Hotjobs, DICE and CareerBuilder (As of today), specifically for opensource skill sets. Basically the site filter

RE: Title: search?

2006-06-26 Thread Sudhi Seshachala
p://tonalweb.com -----Original Message----- From: Sudhi Seshachala [mailto:[EMAIL PROTECTED] Sent: Monday, June 26, 2006 11:44 PM To: nutch-user@lucene.apache.org Subject: Re: Title: search? You should check whether the title is indexed. Make sure the index-basic plugin is included. If url is okay, it should be inc

Re: Title: search?

2006-06-26 Thread Sudhi Seshachala
You should check whether the title is indexed. Make sure the index-basic plugin is included. If url is okay, it should be included. But to be doubly sure, I would go and investigate index-plugin. I am assuming you are using 0.8. Sudhi Seshachala <[EMAIL PROTECTED]> wrote: You should be l

Re: Title: search?

2006-06-26 Thread Sudhi Seshachala
You should be looking if title is Tonal Web Design - Stijn <[EMAIL PROTECTED]> wrote: I have all the plugins enabled for nutch, yet I'm not able to do queries like "title:keyword"? I can only do "url:" and "site:" what else should I look at to fix this? I also can't do ? Wild card searches.

Vertical Search

2006-03-09 Thread Sudhi Seshachala
Hello folks, I am working on adopting Nutch for a vertical. I have been able to get it up and running in pretty basic scenarios. I need some help getting up to speed on crawling sites that have some weird encoding in their URLs. I am kind of lost as to how to go about it. If someone can share s