Hi
Thanks for your reply. It turns out you were correct and I was not
loading the correct document. User error!
Cheers
Amin
On 28 Dec 2008, at 19:50, Grant Ingersoll wrote:
How do you know that the document in question has an id of 1, as in when
you do: Document documentFromIndex = index
AUTOMATIC REPLY
LUX is closed until 5th January 2009
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Hello,
Solr uses IndexCommit#getFileNames() to get a list of files for replication.
One Windows user reported an exception which looks like it may have been
caused by IndexCommit#getFileNames() returning duplicate file names. The
exception in his case was caused by "_21e.tvx" appearing more than once
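As a defensive measure on the caller's side, the returned list could be deduplicated while preserving order before replication uses it. This is only an illustrative sketch (the class and method names are made up here, not Solr's actual replication code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

public class FileNameDedup {
    // Removes duplicate entries while keeping first-seen order,
    // so a repeated "_21e.tvx" is kept only once.
    public static List<String> dedup(Collection<String> fileNames) {
        return new ArrayList<String>(new LinkedHashSet<String>(fileNames));
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList(
                "_21e.fnm", "_21e.tvx", "_21e.tvx", "segments_2");
        System.out.println(dedup(files));
        // prints [_21e.fnm, _21e.tvx, segments_2]
    }
}
```

Whether the duplication should instead be fixed inside IndexCommit#getFileNames() itself is a separate question; the sketch only shows how a consumer can tolerate it.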
I use JDBM to store each document's key ID.
2008/12/30 Chris Lu
> Otis, thanks for the pointer.
> I think the question can be:
>
> How to access TermEnum or TermInfos during indexing.
>
> If this is possible, things would be easier.
>
> --
> Chris Lu
> -
> Instant Scalable Full-
Otis, thanks for the pointer.
I think the question can be:
How to access TermEnum or TermInfos during indexing.
If this is possible, things would be easier.
--
Chris Lu
-
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: h
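For the question above (accessing terms during indexing), one possible sketch, assuming Lucene 2.9's near-real-time reader obtained from the writer; the field name "contents" is illustrative:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;

// 'writer' is an existing, open IndexWriter.
// getReader() (Lucene 2.9) returns a reader that also sees
// documents still buffered in the writer.
IndexReader reader = writer.getReader();
TermEnum terms = reader.terms(new Term("contents", ""));
try {
    do {
        Term t = terms.term();
        if (t == null || !"contents".equals(t.field())) break;
        System.out.println(t.text() + " df=" + terms.docFreq());
    } while (terms.next());
} finally {
    terms.close();
    reader.close();
}
```

Reopening such a reader periodically (rather than per document) keeps the overhead manageable.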
On Dec 29, 2008, at 11:25 AM, Girish Naik wrote:
Thanks Grant I will check this out.
BTW, as far as the Lucene version is concerned, I checked out the svn
of Lucene and created a build; its version says 2.9 :) . And Luke
is version 0.9.1
You will need to plug in your own Lucene jar
Chris,
Mark Miller & Co. are working on (Near) Duplicate Detection. I think the work
is in Solr's JIRA, but some of it might be applicable to Lucene.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Chris Lu
> To: "java-user@lucene.apach
Thanks Grant I will check this out.
BTW, as far as the Lucene version is concerned, I checked out the svn of
Lucene and created a build; its version says 2.9 :) . And Luke is
version 0.9.1
Regards,
Please do not print this email
unless it is absolutely necessary.
It is just File | Open Lucene Index
:)
- Original Message
From: Erick Erickson
To: java-user@lucene.apache.org
Sent: Monday, December 29, 2008 11:05:01 AM
Subject: Re: Where to get login details for Luke
Ummm, I don't understand the question. You don't need to login, Luke is a
stand-
Ummm, I don't understand the question. You don't need to login, Luke is a
stand-alone program for examining Lucene indexes. You *do* have to point
Luke at your index, there should be some choice about opening a file. I
don't
have Luke in front of me here at home, but poke around the menus and it
sh
On Dec 29, 2008, at 9:59 AM, Girish Naik wrote:
FIELD_BODY is defined as
public static final String FIELD_BODY = "AVS_FIELD_BODY";
and it's indexed as
ParsedDoc webdoc = ParsedDoc.getDoc(page);
...
document.add(new Field(Constants.FIELD_BODY, webdoc.getContents(),
Field.Store.NO, Field.Index.
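The snippet above is cut off after "Field.Index.", so the actual mode used isn't shown. For reference, a complete field definition of that shape might look like this; ANALYZED is only a guess at the common case:

```java
// Hedged sketch: the original message is truncated, so
// Field.Index.ANALYZED (called TOKENIZED before Lucene 2.4)
// is an assumption, not what the poster necessarily used.
document.add(new Field(Constants.FIELD_BODY,
                       webdoc.getContents(),
                       Field.Store.NO,
                       Field.Index.ANALYZED));
```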
Hi Karl,
I use payloads for weight only, too, with BoostingTermQuery (see:
http://www.nabble.com/BoostingTermQuery-scoring-td20323615.html#a20323615)
A custom tokenizer looks for the reserved character '\b' followed by a 2
byte 'boost' value. It then creates a special Token type for a custom filt
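A minimal sketch of a filter following the convention described above (a '\b' marker followed by a 2-byte boost), assuming the Lucene 2.4-style Token API; the class name is illustrative:

```java
import java.io.IOException;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.index.Payload;

public class BoostPayloadFilter extends TokenFilter {
    public BoostPayloadFilter(TokenStream input) {
        super(input);
    }

    public Token next(Token reusableToken) throws IOException {
        Token token = input.next(reusableToken);
        if (token == null) return null;
        String text = token.term();
        int sep = text.indexOf('\b');
        if (sep >= 0 && sep + 2 < text.length()) {
            // Store the two boost bytes as the token's payload and
            // strip the marker from the indexed term text.
            byte[] boost = { (byte) text.charAt(sep + 1),
                             (byte) text.charAt(sep + 2) };
            token.setPayload(new Payload(boost));
            token.setTermBuffer(text.substring(0, sep));
        }
        return token;
    }
}
```

At query time a Similarity overriding scorePayload() (as in the BoostingTermQuery thread linked above) can turn those bytes back into a per-occurrence boost.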
Hi Guys,
Can you Please tell me where to get login details for Luke
Thanks
Nagesh
That sounds pretty cool Karl, and I also dig your use of Motorhead as an
example : )
I recently built an application where payloads were a lifesaver, but my
usage of them is pretty basic. I am indexing pages of text, so I use
payloads to store metadata about each word on the page - size, color,
r
What does the FIELD_BODY field look like? Your search is apparently going
against that field, but you don't show how it is indexed.
Have you looked at your index in Luke yet? http://www.getopt.org/luke
On Dec 29, 2008, at 8:19 AM, Girish Naik wrote:
Sorry for that,
Here is how the Analyzer is
Sorry for that,
Here is how the Analyzer is Selected:
public static Analyzer getAnalyzerInstance(String localeKey) {
Analyzer analyzer = null;
if (localeKey == null || localeKey.trim().equals("")) {
localeKey = AppContext.getSetting("defaultLocale");
System.out.println
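The snippet above is truncated, but the selection logic it suggests could be sketched as a locale-keyed lookup with a default fallback. Everything here is illustrative (the locale keys, AppContext, and the choice of analyzers are assumptions; ArabicAnalyzer lives in Lucene's contrib analyzers jar):

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.ar.ArabicAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class AnalyzerFactory {
    // Illustrative registry of per-locale analyzers.
    private static final Map<String, Analyzer> ANALYZERS =
            new HashMap<String, Analyzer>();
    static {
        ANALYZERS.put("ar", new ArabicAnalyzer());
        ANALYZERS.put("en", new StandardAnalyzer());
    }

    public static Analyzer getAnalyzerInstance(String localeKey) {
        if (localeKey == null || localeKey.trim().equals("")) {
            // Hypothetical app-level default, as in the original code.
            localeKey = AppContext.getSetting("defaultLocale");
        }
        Analyzer analyzer = ANALYZERS.get(localeKey);
        return analyzer != null ? analyzer : new StandardAnalyzer();
    }
}
```

The important part for debugging is confirming that the same analyzer is used at both index and query time for a given locale.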
Hi Girish,
Can you provide some sample code and info about what isn't working?
All you have said so far is that the Arabic Analyzer doesn't work for
you, but you have said nothing about how you are actually using it.
Are you getting exceptions? Do the tokens not look right? Are no
res
Hi,
I am having a hard time indexing Arabic content and
searching it via Lucene. I have also used an Arabic Analyzer from
the Lucene package but had no luck. I have also used a snowball jar, but
it doesn't contain an Arabic stemmer. So I put the Lucene Arabic
Stemmer in snowb
I am wondering whether there is an easy way to avoid duplication while
indexing, just using the index being created, without creating other data
structures.
In some cases, the incoming document list can have duplicates. For example,
when creating spell checking indexes for phrases. Each phrase is o
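One way to avoid duplicates using only the index itself is to give each document a unique key and use IndexWriter.updateDocument(), which deletes any existing document with that key before adding. A sketch under assumptions (the field names "id" and "phrase" are illustrative):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;

// 'writer' is an existing, open IndexWriter; 'key' uniquely
// identifies the incoming document (e.g. the phrase itself).
Document doc = new Document();
doc.add(new Field("id", key,
                  Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.add(new Field("phrase", phrase,
                  Field.Store.YES, Field.Index.ANALYZED));
// Delete-then-add by key: re-submitting the same key can never
// leave two copies in the index.
writer.updateDocument(new Term("id", key), doc);
```

This trades some indexing throughput (each add also issues a delete) for not having to maintain a separate seen-keys structure.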
You could use phrase queries too, like "Economic Meltdown" AND "Asian
Countries", but these phrases may be too distant from one another to be
relevant for your searching purposes.
To get better results with respect to position (the distance between
phrases), you can use SpanNearQuery.
Let me know if you need mo
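A sketch of the SpanNearQuery approach described above; the field name "body" and the slop values are illustrative:

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

// Each phrase becomes an exact, in-order span (slop 0, ordered).
SpanQuery meltdown = new SpanNearQuery(new SpanQuery[] {
        new SpanTermQuery(new Term("body", "economic")),
        new SpanTermQuery(new Term("body", "meltdown")) }, 0, true);
SpanQuery countries = new SpanNearQuery(new SpanQuery[] {
        new SpanTermQuery(new Term("body", "asian")),
        new SpanTermQuery(new Term("body", "countries")) }, 0, true);

// Require the two phrases within 10 positions of each other,
// in either order (slop 10, unordered).
SpanQuery combined = new SpanNearQuery(
        new SpanQuery[] { meltdown, countries }, 10, false);
```

Tightening or loosening the outer slop controls how close together the two phrases must appear to match.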