Hi
Look at the Apache Mahout project (based on Hadoop). It seems you need
machine learning algorithms.
Best Regards
Alexander Aristov
On 17 August 2011 20:39, Ian Lea wrote:
> Certainly sounds doable in lucene. Is it basically working apart from
> false positives? Can you giv
indeed. What's the problem?
Best Regards
Alexander Aristov
On 16 May 2011 16:43, Uwe Schindler wrote:
> The problem we have is that you don't really say what your problem is! What's
> wrong with the search results you get? Give an example of the query you
> execute and the docu
Did you check this?
http://wiki.apache.org/solr/Deduplication
Best Regards
Alexander Aristov
On 10 March 2011 18:35, Mark wrote:
> My understanding is that it can mark documents with the same signature
> indicating that they are similar however there is no way at query time to
> retu
Use PDFBox for this purpose. It can create PDF documents, but you will have to
do it yourself.
And Gong is right, this is not the right place to ask such questions.
Best Regards
Alexander Aristov
On 20 February 2011 02:16, Simon Willnauer
wrote:
> Hi Gong Li,
>
> your question is out of scop
Your search engine would extract the text content from a PDF file, and all
markup, pictures, etc. would be lost. So when you search, you would get
only text, highlighted or not.
Best Regards
Alexander Aristov
On 18 February 2011 21:29, Gong Li wrote:
> Hi,
>
> I am developing a PDF sear
Set the correct classpath. You may want to compare which libraries Eclipse
includes in its classpath.
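
One way to do that comparison is to print the effective classpath in both environments and diff the two lists. This is only a minimal sketch: the class name ClasspathDump and the helper entries() are made up for illustration, while java.class.path is the standard JVM system property.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene or Eclipse.
class ClasspathDump {
    /** Splits a classpath string into its entries using the platform separator. */
    static List<String> entries(String classpath) {
        List<String> result = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the running JVM was started with.
        String classpath = System.getProperty("java.class.path", "");
        for (String entry : entries(classpath)) {
            System.out.println(entry);
        }
    }
}
```

Run it once from Eclipse and once from the command line; any jar that shows up only in the Eclipse output is a candidate for the missing library.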
Best Regards
Alexander Aristov
On 25 January 2011 18:12, Alex vB wrote:
>
> Hello everybody,
>
> I used a small indexing example from "Lucene in Action" and can run and
>
Have you looked at Solr? It's a web app.
Best Regards
Alexander Aristov
On 8 December 2010 09:36, Chris Lu wrote:
> You can use DBSight for database search. You just need to give it one or
> several SQLs. And you can generate search result template, and it will
> manage the index for
Anyway, even if you get the correct whitespace and newlines, this won't affect
indexing.
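
A small sketch of why, assuming the analyzer splits on whitespace (most do, but check the one you use). The class WhitespaceTokens is a toy written for this mail, not Lucene's tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

// Toy tokenizer for illustration only; not Lucene's implementation.
class WhitespaceTokens {
    /** Splits text on any run of spaces, tabs or newlines. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Same words, different spacing and line breaks.
        String tidy = "text extracted from a PDF";
        String messy = "text   extracted\n\nfrom a\tPDF";
        System.out.println(tokenize(tidy).equals(tokenize(messy)));
    }
}
```

Both inputs produce the same token list, so the index would contain the same terms either way.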
Best Regards
Alexander Aristov
On 3 December 2010 10:00, Lance Norskog wrote:
> The text should come out as a stream of words with space, but without
> any of the formatting in the PDF. Extract
Are there such events in Russia?
Best Regards
Alexander Aristov
On 20 July 2010 17:59, Ivan Provalov wrote:
> We are organizing a meetup in Michigan on IR. The first meeting is on
> August 19. We will be talking about Lucene's scalability and relevance
> tuning followed b
I suspect that you might be using incompatible versions of Lucene and Kaffe,
though I have never worked with Kaffe before and so might be wrong.
Best Regards
Alexander Aristov
2009/8/11 石川
> Hi,
> I am a newbie in lucene and am trying the 'indexing and searching'
> demo of
You will need to develop a parser and an indexer.
But remember that in the current implementation the content is not stored in
the Lucene index: it is indexed, yes, but not stored.
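
To show the difference between indexed and stored, here is a toy inverted index in plain Java. It is a sketch only: ToyIndex and its methods are invented for this mail and are not Lucene's API. In Lucene itself the choice is made per field (Field.Store.YES or Field.Store.NO).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy index for illustration only; not Lucene's API.
class ToyIndex {
    // term -> ids of documents containing it (the "indexed" part)
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // doc id -> original text, kept only on request (the "stored" part)
    private final Map<Integer, String> stored = new HashMap<>();

    void add(int docId, String text, boolean store) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        if (store) {
            stored.put(docId, text);
        }
    }

    /** Returns the ids of documents containing the term. */
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                Collections.<Integer>emptySet());
    }

    /** Returns the original text, or null if it was indexed without storing. */
    String retrieve(int docId) {
        return stored.get(docId);
    }
}
```

A document added with store set to false can still be found by search(), but retrieve() returns null for it, so the original content has to be fetched from wherever it lives outside the index.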
Best Regards
Alexander Aristov
2009/5/28 Gaurav Kumar
> Hi everyone,
>
> I am doing a project using Lucene where i need to i
As I said, try Solr. It has a web interface which can give you ideas. There is
no reason to work through it here by email. If you know Java, you will quickly
catch up.
Best Regards
Alexander Aristov
2009/5/26 StanleyTan
>
>
> Hi Alexander,
>
> thanks for your advise. but im th
From my point of view, the best option for you would be to use Solr. You can
integrate it with any web app/HTML page
Alexander
2009/5/26 StanleyTan
>
> hi all,
>
> I'm new to Lucene. I would like to ask: am I able to integrate Lucene search
> into my normal HTML pages, not web applications? meanin
..
>
> I'd appreciate your help
>
>
>
>
--
Best Regards
Alexander Aristov
ying scope(path).
> Also is there any direct support for Facets building in Lucene?
>
> Regards,
> Mayur
>
> -----
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
--
Best Regards
Alexander Aristov
arsing MSWord
>
> Hello,
> I wanted to know if there are classes in Lucene that support parsing MSWord
> documents.
> Many thanks,
> Dipesh
>
> --------
> "Help Ever Hurt Never"- Baba
>
--
Best Regards
Alexander Aristov
Hi all,
I just wonder whether Nutch takes the rules in the robots.txt file into
account while crawling a site.
--
Best Regards
Alexander Aristov
not be so performance intense.
>
> Does this sound like a reasonable solution? Of course this means creating
> a lot of IndexReaders/Writers but I prefer that to searching in a huge index
> every time a user only wants to search in a slice of the total index.
>
> Best Regards,
> Tobias Larsson Hult
>
>
--
Best Regards
Alexander Aristov
dexing-Error-tp19385951p19385951.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
--
Best Regards
Alexander Aristov