Hi,
On 7/24/06, Chris Mattmann <[EMAIL PROTECTED]> wrote:
Thanks for your email. Jerome Charron and I proposed a project with a
similar goal in mind that we wanted to dub "Tika". Tika would effectively be
a Lucene sub-project, and would factor out some of the capabilities you
mention below from
Stefan Neufeind wrote:
Hi,
I might be bringing up old discussions (sorry if so), but while discussing
segread/readseg I wondered why "prune" is missing from bin/nutch.
It still works when you give the full class name by hand. But could
it be (re)added to bin/nutch as well?
I think P
Hi,
I might be bringing up old discussions (sorry if so), but while discussing
segread/readseg I wondered why "prune" is missing from bin/nutch.
It still works when you give the full class name by hand. But could
it be (re)added to bin/nutch as well?
Regards,
Stefan
I like it!
On 24.07.2006 at 16:10, Andrzej Bialecki wrote:
Stefan Neufeind wrote:
Andrzej Bialecki wrote:
Stefan Groschupf wrote:
Hi developers,
we have commands like readdb and readlinkdb, but segread. Wouldn't it
be more consistent to name the command readseg instead of segread?
... just a th
Stefan Neufeind wrote:
Andrzej Bialecki wrote:
Stefan Groschupf wrote:
Hi developers,
we have commands like readdb and readlinkdb, but segread. Wouldn't it be
more consistent to name the command readseg instead of segread?
... just a thought.
Yes, it seems more consistent. However, if we change it
[
http://issues.apache.org/jira/browse/NUTCH-322?page=comments#action_12423187 ]
Andrzej Bialecki commented on NUTCH-322:
-
Good questions ... ;)
ad 1: Google shows only the final page, and you can access it through both the
original (s
Andrzej Bialecki wrote:
Stefan Groschupf wrote:
Hi developers,
we have commands like readdb and readlinkdb, but segread. Wouldn't it be
more consistent to name the command readseg instead of segread?
... just a thought.
Yes, it seems more consistent. However, if we change it then scripts
people wr
Jukka Zitting wrote:
Hi,
Any interest in this?
definitely :-)
Michi
If not, is there some other Lucene project that
I should approach?
BR,
Jukka Zitting
On 7/18/06, Jukka Zitting <[EMAIL PROTECTED]> wrote:
Hi,
I'm a committer of the Apache Jackrabbit project, and I've recently
been
Stefan Groschupf wrote:
Hi developers,
we have commands like readdb and readlinkdb, but segread. Wouldn't it be
more consistent to name the command readseg instead of segread?
... just a thought.
Yes, it seems more consistent. However, if we change it then scripts
people wrote would break. We could
Hi Jukka,
Thanks for your email. Jerome Charron and I proposed a project with a
similar goal in mind that we wanted to dub "Tika". Tika would effectively be
a Lucene sub-project, and would factor out some of the capabilities you
mention below from Nutch, incl:
1. MimeType repository
2. Parser i
Hi,
Any interest in this? If not, is there some other Lucene project that
I should approach?
BR,
Jukka Zitting
On 7/18/06, Jukka Zitting <[EMAIL PROTECTED]> wrote:
Hi,
I'm a committer of the Apache Jackrabbit project, and I've recently
been working on improving the full text indexing support
Hi developers,
we have commands like readdb and readlinkdb, but segread. Wouldn't it be
more consistent to name the command readseg instead of segread?
... just a thought.
Stefan
[ http://issues.apache.org/jira/browse/NUTCH-324?page=all ]
Andrzej Bialecki closed NUTCH-324.
---
Fix Version/s: 0.8-dev
Resolution: Fixed
Patch applied, with minor whitespace diffs and doc. clarifications. Thank you!
> db.score.link.internal an
[ http://issues.apache.org/jira/browse/NUTCH-167?page=all ]
Andrzej Bialecki updated NUTCH-167:
Attachment: patch.txt
This patch implements support for Pragma: no-cache and Robots: noarchive.
Three "cache policies" are supported in this patch:
* C
[ http://issues.apache.org/jira/browse/NUTCH-329?page=all ]
Andrzej Bialecki closed NUTCH-329.
---
Resolution: Fixed
Fixed. Thanks!
> CrawlDbReader processTopNJob does not set jobNames
> --
>
>
[
http://issues.apache.org/jira/browse/NUTCH-322?page=comments#action_12422996 ]
Enrico Triolo commented on NUTCH-322:
-
Ok, I can see your point, nevertheless I think we should consider some
potential problems that could arise from such modi
Are you planning to update Hadoop to trunk/ ? I'd rather be careful
with that - I'm not sure if it's still compatible with Java 1.4,
besides being unreleased/unstable ...
Not planning an upgrade, just want to know if it resolves the issues.
We can then decide what's the best thing to do.
Sami Siren (JIRA) wrote:
[ http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12422929 ]
Sami Siren commented on NUTCH-266:
--
I finally found the time to set up an environment with cygwin and try this out. I can confirm that the