Hi everybody,
I haven't used either PLucene or PyLucene. Those were
just guesses at what could be used. Thanks for pointing
me to PyLucene though, I didn't know it existed.
Jeff: Concerning your CGI implementation, you might
want to follow the OpenSearch API Thread. The REST
implementation proposed
Hi
I don't know why the war target in build.xml (in svn) does not include
jakarta-oro-2.0.7.jar.
<target name="war" depends="jar,generate-docs">
...
<lib dir="${lib.dir}">
<include name="lucene*.jar"/>
<include name="taglibs-*.jar"/>
include
On 31.03.2005 at 03:55, Rohit Kulkarni wrote:
I tried to search for the parse zip files plugin implementation you
mentioned...but couldn't find it
It was in the old bug tracker, but that is not available any more.
However, your plugin is easy to implement.
Just uncompress the content and then
Doug Cutting wrote:
The proposal:
One more:
7. No code should call NutchConf.get() except a tool's main().
Doug
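A minimal Java sketch of how rule 7 could combine with the action/tool split discussed in this thread. All names here (Conf, TruncateAction, TruncateTool, the "truncate.max.length" key) are hypothetical stand-ins and are not taken from the Nutch code base; in particular Conf is not the real NutchConf API. The action receives its configuration as an argument, and only the tool's main() performs the global lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object; NOT the real NutchConf API.
final class Conf {
    private static final Conf GLOBAL = new Conf();
    private final Map<String, String> values = new HashMap<>();

    // Global lookup; in this sketch only a tool's main() is allowed to call it.
    static Conf get() {
        return GLOBAL;
    }

    Conf set(String key, String value) {
        values.put(key, value);
        return this;
    }

    int getInt(String key, int defaultValue) {
        String v = values.get(key);
        return v == null ? defaultValue : Integer.parseInt(v);
    }
}

// Hypothetical action: operates on data and takes its configuration explicitly.
final class TruncateAction {
    private final int maxLength;

    TruncateAction(Conf conf) {
        this.maxLength = conf.getInt("truncate.max.length", 10);
    }

    String run(String input) {
        return input.length() <= maxLength ? input : input.substring(0, maxLength);
    }
}

// Hypothetical tool: the command-line entry point that invokes the action.
final class TruncateTool {
    public static void main(String[] args) {
        Conf conf = Conf.get(); // the only global configuration lookup
        TruncateAction action = new TruncateAction(conf);
        for (String arg : args) {
            System.out.println(action.run(arg));
        }
    }
}
```

Because the action never touches global state, it can be constructed with a private configuration in a test or inside a container.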
John X wrote:
On Thu, Mar 31, 2005 at 12:45:39AM +0200, Stefan Groschupf wrote:
Actually it is difficult to have tools use both ndfs and the local file system.
What do people think about introducing an ndfs notation in paths like it
is used in protocol handlers? (ala http:// or file://)
I don't mean to
Actually, I was thinking about a program written in Java using the CGI
interface instead of the servlet framework. Sounds like nobody is
working on that.
On Wed, 30 Mar 2005 2:40 am, Olaf Thiele wrote:
Hi Jeff,
as segments are stored in a home-grown file format you will
need to program your
Hi Stefan,
I am new to this mailing list and came across the parse zip file
plugin discussion..
Back in the day I already contributed such a plugin.
Browse the old list archive or bugzilla.
I tried to search for the parse zip files plugin implementation you
mentioned...but couldn't find it
In particular, I would love to see a REST contribution.
Yes, I think it's a great idea too.
Once this is implemented, search.jsp can be replaced with a filter that
applies a stylesheet to XML search results.
Servlet => XML => HTML
instead of Servlet => HTML
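A rough Java sketch of the stylesheet step, using the standard javax.xml.transform API. The XML result format (results/hit with a url attribute) and the stylesheet are assumptions made up for illustration; the thread does not define a schema, and this is not how search.jsp or any Nutch filter is actually written.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XmlToHtmlSketch {
    // Hypothetical XML search result; the real format is not specified in the thread.
    static final String SAMPLE_XML =
        "<results>"
        + "<hit url=\"http://example.org/a\"/>"
        + "<hit url=\"http://example.org/b\"/>"
        + "</results>";

    // Hypothetical stylesheet that turns each hit into a list item.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/results\">"
        + "<ul><xsl:for-each select=\"hit\">"
        + "<li><xsl:value-of select=\"@url\"/></li>"
        + "</xsl:for-each></ul>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML document and returns the markup as a string.
    static String render(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(SAMPLE_XML, STYLESHEET));
    }
}
```

This sketch compiles the stylesheet on every call; a filter would more likely compile it once (javax.xml.transform.Templates exists for that purpose) and reuse it per request.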
In my opinion, it is the front-end
Dear Developers,
I have a problem:
I would like the page list (1-10) to run to the end of the hit list (as on Google).
The problem: when a site has several hits, Hits.getTotal() does not
return the real end of the hits.
When I click on, e.g., page 3, the result is an empty page
(NutchBean.search is out of
I propose we cleanup Nutch's tools as follows.
First, some definitions:
1. An action is an operation on Nutch data. For example,
GenerateSegmentFromDB, FetchSegment, UpdateDB, IndexSegment,
MergeIndexes, SearchServer, etc. are all actions.
2. A tool invokes an action from the command line.
The
Hi,
Just wanted to know whether Nutch supports date range search (say, a query for
web pages updated in the last X days) and URL search (like site: in
Google) yet. If yes, what syntax should be used in the query?
Thanks,
Rohit
Andrzej Bialecki wrote:
This is yet another case that speaks in favor of adding an
out-of-the-box XML API to Nutch.
Yes, I agree.
* REST - HTTP GET or POST request, with query parameters contained in
GET or POST parameters. An XML data document with results is a response.
Lightweight, easy to
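As an illustration of the REST style described here, this Java sketch builds a GET request URL with the query carried in URL parameters. The endpoint and the parameter names (query, start, hitsPerPage) are assumptions for the example only; the thread does not fix them.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class RestQuerySketch {
    // URL-encodes one parameter value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Builds the GET URL; endpoint and parameter names are hypothetical.
    static String buildUrl(String endpoint, String query, int start, int hitsPerPage) {
        return endpoint
            + "?query=" + encode(query)
            + "&start=" + start
            + "&hitsPerPage=" + hitsPerPage;
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("http://localhost:8080/search", "apache nutch", 0, 10));
    }
}
```

A client would issue an HTTP GET on that URL and parse the XML document in the response.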
MIME content type detector (using magic char sequences)
---
Key: NUTCH-33
URL: http://issues.apache.org/jira/browse/NUTCH-33
Project: Nutch
Type: New Feature
Reporter: Jerome Charron
Priority: Minor
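For readers unfamiliar with the technique named in the issue title, here is a small Java sketch of content type detection from magic byte sequences. It is only an illustration of the general idea with four well-known signatures, not the implementation attached to NUTCH-33; the class and method names are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class MagicMimeSketch {
    // Leading byte sequences ("magic") mapped to MIME types; checked in insertion order.
    private static final Map<byte[], String> MAGIC = new LinkedHashMap<>();

    static {
        MAGIC.put("%PDF".getBytes(StandardCharsets.US_ASCII), "application/pdf");
        MAGIC.put(new byte[] {'P', 'K', 3, 4}, "application/zip");
        MAGIC.put("GIF8".getBytes(StandardCharsets.US_ASCII), "image/gif");
        MAGIC.put(new byte[] {(byte) 0x89, 'P', 'N', 'G'}, "image/png");
    }

    // Returns the MIME type of the first matching signature, or a generic fallback.
    static String detect(byte[] content) {
        for (Map.Entry<byte[], String> entry : MAGIC.entrySet()) {
            if (startsWith(content, entry.getKey())) {
                return entry.getValue();
            }
        }
        return "application/octet-stream";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.4".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(sample));
    }
}
```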
Doug,
The proposal:
1. Actions and tools should be separate classes, in separate files.
Wonderful! :-) That will make a set of things (e.g. run nutch in a
container) very easy.
3. All actions must implement the following interface:
Inversion of control makes a lot of sense!
5. All plugins must
[ http://issues.apache.org/jira/browse/NUTCH-7?page=comments#action_61899 ]
Phoebe Miller commented on NUTCH-7:
---
I have fixed this problem by changing the update database tool. Basically,
links from a page are not added if the page has already been
Andrzej Bialecki wrote:
This also nicely solves the non-obvious requirement that all ndfs paths
must begin with a slash...
I fixed that a while back. Things that don't start with a slash are
currently made relative to /user/$USER.
Doug
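The rule Doug describes can be sketched in a few lines of Java. This is an illustration of the stated behaviour only (absolute paths kept, everything else placed under /user/$USER); the class and method names are hypothetical and it is not the actual ndfs code.

```java
final class NdfsPathSketch {
    // Keeps absolute paths as they are; anything else becomes relative to /user/<user>.
    static String resolve(String path, String user) {
        if (path.startsWith("/")) {
            return path;
        }
        return "/user/" + user + "/" + path;
    }

    public static void main(String[] args) {
        String user = System.getProperty("user.name");
        System.out.println(resolve("segments", user)); // relative, ends up under /user/<user>
        System.out.println(resolve("/data/db", user)); // absolute, unchanged
    }
}
```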
Jérôme Charron wrote:
Servlet => XML => HTML
instead of Servlet => HTML
In my opinion, it is the dream front-end architecture. But more
pragmatically, I'm not sure it's a good idea. XSL transformation is a
rather slow process!! And the Nutch front-end must be very responsive.
I don't think this
Is your change to the update db tool going to be in the next release?
Have you tested it?
Thanks for the fix!
-Original Message-
From: Phoebe Miller (JIRA) [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 31, 2005 8:59 AM
To: nutch-dev@incubator.apache.org
Subject: [jira] Commented:
The way I do it is thus:
When hits.totalIsExact(), the final page can be found simply from
hits.getTotal()
When NOT hits.totalIsExact(), I run the query again, this time retrieving,
say, 1000 URLs (the max number of results I allow to be returned). Using a
loop (increment counter by number of
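For the exact-total case described above, the final page follows from the total by ceiling division. The Java sketch below shows only that arithmetic; the class and method names are hypothetical, the inputs stand in for hits.getTotal() and the page size, and the inexact case (re-running the query and looping) is not covered because the message is cut off.

```java
final class LastPageSketch {
    // 1-based number of the final result page when the total hit count is exact.
    static long lastPage(long totalHits, int hitsPerPage) {
        if (hitsPerPage <= 0) {
            throw new IllegalArgumentException("hitsPerPage must be positive");
        }
        if (totalHits <= 0) {
            return 0; // no hits, so no pages
        }
        return (totalHits + hitsPerPage - 1) / hitsPerPage; // ceiling division
    }

    // 0-based index of the first hit shown on a given 1-based page.
    static long firstHitOnPage(long page, int hitsPerPage) {
        return (page - 1) * hitsPerPage;
    }

    public static void main(String[] args) {
        System.out.println(lastPage(25, 10));
        System.out.println(firstHitOnPage(3, 10));
    }
}
```

Clamping the requested page to lastPage(...) before searching would avoid asking for a start index past the end of the hits.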
19 matches