Hi Owen,
I'm from the Carrot2 project, so I feel called to the blackboard:
One source for how to do this is the thesis of Stanislaw Osinski and
others like it:
http://www.dcs.shef.ac.uk/teaching/eproj/msc2004/abs/m3so.htm
And the Carrot2 project which uses similar techniques.
http://www.cs
Hi Owen,
Last year it was suggested Carrot2 could help, and it would even produce
good labels for the clusters. Has this proven to be true?
Yes, Carrot2 should help you with this. The labels it creates depend
heavily on the quality of the input snippets, but the so-called KWIC
snippets (keywor
Karl,
Two things, try to experiment with both:
1) I would try to write a lexical scanner that strips HTML tags, much
like the regular expression does. Java lexical scanner packages produce
nice pure Java classes that seldom use any advanced API, so they should
work on Java 1.1. They are simple s
Hi Adam.
Otis and David have already provided you with pointers to my previous
post regarding Carrot2-Lucene integration, so just a tiny note here:
Also, when I looked at Carrot2, the pipeline is implemented over HTTP. I
wonder how efficient that is, or whether it can be changed, for instance for an
Hi.
Coming up with answers... a little belated, but hope you're still on:
we have been experimenting with Carrot2 and are very pleased so far,
only one issue: there is no release, not even an alpha one, and the
dependencies seemed to be patched (jama)
Yes, there is no "official" release. We just don
Hi David,
I apologize for the delay in answering this one; Lucene is a busy
mailing list and I had a hectic last week... Again, sorry for the belated
answer, hope you still find it useful.
That is awesome and very inspirational!
Yes, I admit what you've done with Wikipedia is quite interesting and
It is quite interesting, Erik, thanks for the link. I'm sure you're
aware of the post-search clustering addon to Nutch that is based on the
project I'm heading -- Carrot2. If you have any ideas of how this could
be made better, I'm always open to suggestions.
Regards,
Dawid
http://www.cs.put.po
No problem. Let people know if it worked for you -- I look forward to
hearing your experiences (good or bad).
Dawid
William W wrote:
Thanks, Dawid! :)
From: Dawid Weiss <[EMAIL PROTECTED]>
Reply-To: "Lucene Users List" <[EMAIL PROTECTED]>
To: Lucene Users List <[
ilter on top of what it returns. Shouldn't be too hard.
Dawid
Albert Vila wrote:
That's great, thanks Dawid.
Just a question: how can I modify your code in order to use the
carrot2-output-xsltrenderer to output the clustering results in an HTML
page?
Can you provide an example?
Tha
Hi William,
Ok, here is some demo code I've put together that shows how you can
achieve clustering of Lucene's results. I hope this will get you started
on your projects. If you have questions, please don't hesitate to ask --
cross posts to carrot2-developers would be a good idea too.
The code
nothing to do with each other. Furthermore, Arabic uses phonetic
indicators on each letter, called diacritics, that change the way you
pronounce the word, which in turn changes the word's meaning, so two words
spelled exactly the same way with different diacritics will mean two
separate things,
Just
Nutch has a crawler. So does Egothor (the crawler is called Capek). If
you type "web crawler" in Google you'll get tons of projects.
Dawid
Zhang, Lisheng wrote:
Hi,
Does anyone know if there is free software to crawl Internet sites
(a web crawler)? I know Lucene currently does not have this feature
a
ts -- he's just very shy by nature and doesn't talk much, hehe.
D.
William W wrote:
Hi Dawid,
The demos (under /src/demo) are very good. They have the basic usage
scenario.
Thanks Andrzej.
William.
Dawid Weiss wrote:
Hi William,
No, I don't have examples because I never used Lucene
I get back.
D.
Andrzej Bialecki wrote:
Dawid Weiss wrote:
Hi William,
No, I don't have examples because I never used Lucene directly. If you
provide me with a sample index and an API that executes a query on
this index (I need document titles, summaries, or snippets and an
anchor (identi
I'll try to write the integration code with
Lucene. It is only a matter of writing a simple InputComponent instance
and this is really trivial (see Nutch's plugin code).
Dawid
William W wrote:
Hi Dawid,
I would like to use Carrot2 with Lucene. Do you have examples?
Thanks a lot,
William.
Fr
Dear all,
I saw a post about an attempt to integrate Carrot2 with Lucene. It was a
while ago, so I'm curious if any outcome has been achieved.
Anyway, as the project coordinator I can offer my help with such
integration; if you're looking for some ready-to-use code then there is
a clustering pl