Hello,
After running 'screen',
I ran the following script using nutch 0.5:
while true
do
    nutch generate $db $segdir -topN $segsize
    seg=$(ls -d $segdir/2* | tail -1)
    nutch fetch $seg
    nutch updatedb $db $seg
    nutch analyze $db 2
done
I detached from screen, and let the crawler run.
It
Hi,
I just committed a high-level API for working with segment data. The
classes are located in the net.nutch.segment.* package.
The SegmentReader offers a superset of the functionality of the
DumpSegment tool, so I'm removing that tool. Thanks to John Xing for
providing the initial implementation!
Hello,
I am using Nutch in a product research and development cycle. I am
applying search technologies to certain commercial problems in digital
media production and presentation. I seek software development
colleagues.
Previous development efforts I've led include Adobe AfterEffects, a
di
I'm using FetchListTool, but in a slightly less usual way, so I haven't
seen the issue you are describing, but yes, the current code does look
wrong. One could get even a little defensive and add a check for
page.nextFetchTime() and do 'now + interval' only if that nextFetchTime
returns something b
Andrzej -
Thanks for the input. I'll look into REST.
On Wed, 10 Nov 2004 19:39:45 +0100, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
> M H,
>
> I would like to encourage you to re-evaluate your choice of SOAP as the
> Web Service protocol. In my opinion the benefits of SOAP don't outwei
Hi,
Reading the code in FetchListTool I found the following snippet which
puzzles me:
//
// Last, add the FetchListEntry to a file so we can
// sort by score. Be sure to modify the Page's
// fetchtime; this allows us to soon generate
// another fetchlist which
I own webspider.com. Would you be interested in using the domain to
implement Nutch?
Installing Nutch is very simple.
Here you can find a tutorial:
http://www.nutch.org/docs/en/tutorial.html
HTH
Stefan
---
This SF.Net email is sponsored by:
Syba
Hi
I own webspider.com. Would you be interested in using the domain to
implement Nutch?
John
M H,
I would like to encourage you to re-evaluate your choice of SOAP as the
Web Service protocol. In my opinion the benefits of SOAP don't outweigh
the fact that it's too complex and brings too much overhead for such a
simple service - both in terms of server-side CPU, and in the bandwidth
cos
Antigen for Exchange found Body of Message infected with VIRUS= HTML/MyDoo
(Norman) virus.
The file is currently Removed. The message, "[Nutch-dev] Hey!", was
sent from [EMAIL PROTECTED] and was discovered in SMTP Messages\Inbound
located at Perfectinfo/First Administrative Group/MAIL.
This ema
m h wrote:
My company is using PHP on most of our website, but would like to
interface with Nutch via web services. I understand that there have
been efforts in the past to use Axis, but the current idea is to use
JMX somehow. I must admit that I have zero experience with JMX, but
if the nutch co