[Nutch-dev] Fetcher Hung

2004-11-10 Thread Nutch Crawler
Hello, After running 'screen', I ran the following script using nutch 0.5:
    while [ true ]
    do
      nutch generate $db $segdir -topN $segsize
      seg=`ls -d $segdir/2* | tail -1`
      nutch fetch $seg
      nutch updatedb $db $seg
      nutch analyze $db 2
    done
I detached from screen, and let the crawler run. It

[Nutch-dev] Segment API

2004-11-10 Thread Andrzej Bialecki
Hi, I just committed a high-level API for working with segment data. The classes are located in the net.nutch.segment.* package. The SegmentReader offers a superset of the functionality of the DumpSegment tool, therefore I'm removing that tool. Thanks to John Xing for providing the initial implementation!

[Nutch-dev] Greetings -- Seeking Development Colleagues for a Nutch-based Product Research and Development Project

2004-11-10 Thread Greg Deocampo
Hello, I am using Nutch in a product research and development cycle. I am applying search technologies to certain commercial problems in digital media production and presentation. I seek software development colleagues. Previous development efforts I've led include Adobe AfterEffects, a di

Re: [Nutch-dev] Possible bug in FetchListTool

2004-11-10 Thread ogjunk-nutch
I'm using FetchListTool, but in a slightly less usual way, so I haven't seen the issue you are describing, but yes, the current code does look wrong. One could get even a little defensive and add a check for page.nextFetchTime() and do 'now + interval' only if that nextFetchTime returns something b

Re: [Nutch-dev] Wanting to help implement nutch web service

2004-11-10 Thread m h
Andrzej - Thanks for the input. I'll look into REST. On Wed, 10 Nov 2004 19:39:45 +0100, Andrzej Bialecki <[EMAIL PROTECTED]> wrote: > M H, > > I would like to encourage you to re-evaluate your choice of SOAP as the > Web Service protocol. In my opinion the benefits of SOAP don't outwei

[Nutch-dev] Possible bug in FetchListTool

2004-11-10 Thread Andrzej Bialecki
Hi, Reading the code in FetchListTool I found the following snippet which puzzles me:
    //
    // Last, add the FetchListEntry to a file so we can
    // sort by score. Be sure to modify the Page's
    // fetchtime; this allows us to soon generate
    // another fetchlist which

Re: [Nutch-dev] domain

2004-11-10 Thread Stefan Groschupf
> I own webspider.com, would you be interested in using the domain to implement nutch. Installing nutch is very simple. Here you can find a tutorial: http://www.nutch.org/docs/en/tutorial.html HTH Stefan

[Nutch-dev] domain

2004-11-10 Thread John
Hi I own webspider.com, would you be interested in using the domain to implement nutch. John

Re: [Nutch-dev] Wanting to help implement nutch web service

2004-11-10 Thread Andrzej Bialecki
M H, I would like to encourage you to re-evaluate your choice of SOAP as the Web Service protocol. In my opinion the benefits of SOAP don't outweigh the fact that it's too complex and brings too much overhead for such a simple service - both in terms of server-side CPU, and in the bandwidth cos

[Nutch-dev] Antigen found VIRUS= HTML/MyDoo (Norman) virus

2004-11-10 Thread Antigen_MAIL
Antigen for Exchange found Body of Message infected with VIRUS= HTML/MyDoo (Norman) virus. The file is currently Removed. The message, "[Nutch-dev] Hey!", was sent from [EMAIL PROTECTED] and was discovered in SMTP Messages\Inbound located at Perfectinfo/First Administrative Group/MAIL. This ema

Re: [Nutch-dev] Wanting to help implement nutch web service

2004-11-10 Thread Doug Cutting
m h wrote: My company is using php on most of our website, but would like to interface with nutch via web services. I understand that there have been efforts in the past to use axis, but the current idea is to use jmx somehow. I must admit that I have zero experience with jmx, but if the nutch co