Lucene indexing

2007-08-30 Thread Madhu
Hi all, I am trying to index a 5 MB Excel file, but indexing it with POI 3 gives me an out-of-memory exception. Does anyone know how to index large Excel files? - To unsubscribe, e-mail: [EMAIL PROTECTED] For

Lucene Indexing

2007-01-24 Thread Sairaj Sunil
Hi all, Can you tell me the exact indexing algorithm used by Lucene, or give some links to documents that describe it? Thanks in advance -- Sairaj Sunil

Lucene Indexing

2011-06-05 Thread Pranav goyal
Hi all, I am a newbie to Lucene. I have successfully created my Lucene index, but I do not understand how to invalidate previous indexes whenever I add/delete/update any field in my index. Please help me out. For better understanding, I have written out my indexing function: StandardAnalyzer analy

Lucene Indexing

2011-06-06 Thread Pranav goyal
Hi all, I am stuck and cannot think of what to do. I have one structure which I have to index. Let's say the structure is named Contract and has a unique Contract_ID. Let's say I have 50 contracts to index. Now each contract has, let's say, 100 different keys with their va

lucene indexing

2006-04-07 Thread trupti mulajkar
Hi, can anyone suggest how to split files using Lucene? I am trying to index the TREC collection using lucene-1.4.3. I want Lucene to read the multiple files within a single TREC file and create an index accordingly. Cheers, trupti mulajkar MSc Advanced Computer Science -

Re: Lucene indexing

2007-08-30 Thread Karl Wettin
On 30 Aug 2007 at 11:24, Madhu wrote: Hi all, I am trying to index a 5 MB Excel file, but indexing it with POI 3 gives me an out-of-memory exception. Does anyone know how to index large Excel files? Increase the maximum VM heap size? http://blogs.sun.com/watt/resource/jvm-

Lucene indexing error

2007-10-08 Thread Narendra yadala
Hi All I am getting this error when I am doing Indexing using Lucene. java.io.IOException: Access is denied on java.io.WinNTFileSystem.createFileExclusively Please let me know if there is any fix for this bug. Thanks Narendra

lucene indexing doubts

2007-10-25 Thread poojasreejith
d any solution for it. Pooja -- View this message in context: http://www.nabble.com/lucene-indexing-doubts-tf4692435.html#a13412076 Sent from the Lucene - Java Users mailing list archive at Nabble.com. - To unsubscribe, e-mail: [

Lucene Indexing structure

2008-04-26 Thread Vaijanath N. Rao
Hi Lucene-user and Lucene-dev, I want to use Lucene as a backend for image search (Content Based Image Retrieval). Indexing Mechanism: a) Get the image properties such as Texture Tamura (TT), Texture Edge Histogram (TE), Color Coherence Vector (CCV) and Color Histogram (CH) and Color Co

Lucene indexing pdf

2006-06-27 Thread mcarcelen
Hi, I'm new to Lucene and I'm trying to index a PDF, but whatever I query, it returns nothing. Can anyone help me? Thanks a lot, Teresa

Lucene indexing RDF

2006-06-27 Thread mcarcelen
Hi, Do you know another library for indexing RDF? Thanks a lot for your help Teresa -Original Message- From: Suba Suresh [mailto:[EMAIL PROTECTED] Sent: Tuesday, 27 June 2006 17:38 To: java-user@lucene.apache.org Subject: Re: Lucene indexing pdf I used PDFBox library as

Lucene indexing PPT

2006-06-30 Thread mcarcelen
Hi everybody! I'm trying to build an index with PPT files. I have downloaded the POI API, "poi.bin.3.0" and "poi.src.3.0", but I don't know where I should unzip them. I'd like to build the index from the command line, the same way as > java -cp lucene-core-2.0.0.jar;lucene-demos-2.0.0.jar org.a

Re: Lucene Indexing

2007-01-24 Thread Rajiv Roopan
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html On 1/24/07, Sairaj Sunil <[EMAIL PROTECTED]> wrote: Hi all, Can you tell me the exact indexing algorithm used by Lucene. or give some links to the documents that describe the algorithm used by lucene Thanks in adva

Re: Lucene Indexing

2007-01-25 Thread Sairaj Sunil
Hi I was asking what exactly is the inverted indexing strategy used for storing the index. Is it batch-based index/b-tree based/segment-based data structure that is used as an index data structure. On 1/25/07, Rajiv Roopan <[EMAIL PROTECTED]> wrote: http://lucene.apache.org/java/docs/api/org/

RE: Lucene Indexing

2007-01-26 Thread Damien McCarthy
This document should contain the information you need : http://lucene.sourceforge.net/talks/inktomi/ Damien. -Original Message- From: Sairaj Sunil [mailto:[EMAIL PROTECTED] Sent: 26 January 2007 03:22 To: java-user@lucene.apache.org Subject: Re: Lucene Indexing Hi I was asking what

Re: Lucene Indexing

2007-01-26 Thread Sairaj Sunil
]> wrote: This document should contain the information you need : http://lucene.sourceforge.net/talks/inktomi/ Damien. -Original Message- From: Sairaj Sunil [mailto:[EMAIL PROTECTED] Sent: 26 January 2007 03:22 To: java-user@lucene.apache.org Subject: Re: Lucene Indexing Hi I was asking w

Re: Lucene Indexing

2007-01-26 Thread Grant Ingersoll
te: This document should contain the information you need : http://lucene.sourceforge.net/talks/inktomi/ Damien. -Original Message- From: Sairaj Sunil [mailto:[EMAIL PROTECTED] Sent: 26 January 2007 03:22 To: java-user@lucene.apache.org Subject: Re: Lucene Indexing Hi I was asking wh

[ALFRESCO] - lucene indexing

2014-08-04 Thread Tristan
Not sure if I'm in the right place, but I am looking for help with Lucene indexing in Alfresco. It looks like indexing is turned on; however, I'm specifically having issues with not being able to query values on a custom property in a custom model. I added the index enable on the field but

Re: Lucene Indexing

2011-06-06 Thread Anshum
Hi Pranav, From what you've mentioned, it looks like you want to modify a particular document (or all docs) by adding a particular field to the document(s). As of right now, it's not possible to modify a document inside a Lucene index. That is due to the way the index is structured. The only way as

Re: Lucene Indexing

2011-06-06 Thread Pranav goyal
Hi Anshum, Thanks for answering my question. By this I got to know that I cannot update without deleting my document. So whenever I am indexing the documents first I need to check whether the particular key exists in the document or not and if it exists I need to delete it and add the updated one

Re: Lucene Indexing

2011-06-06 Thread Anshum
Yes, You'd need to delete the document and then re-add a newly created document object. You may use the key and delete the doc using the Term(key, value). -- Anshum Gupta http://ai-cafe.blogspot.com On Mon, Jun 6, 2011 at 4:45 PM, Pranav goyal wrote: > Hi Anshum, > > Thanks for answering my que
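The delete-then-re-add update described in this reply can be sketched with a toy in-memory index. This is only a model, assuming a made-up `ToyIndex` class with hypothetical `addDocument`, `deleteDocument`, `updateDocument`, and `search` methods; it is not Lucene's API. It shows why stale terms disappear only when the old document is removed by its key before the new version is added.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy in-memory index (hypothetical, not Lucene): models update as delete + add. */
class ToyIndex {
    // Unique key (for example a Contract_ID) -> stored fields of that document.
    private final Map<String, Map<String, String>> docsByKey = new LinkedHashMap<>();
    // Term -> keys of the documents that contain the term (a tiny inverted index).
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Adds a document; every whitespace-separated word of every field value becomes a term. */
    void addDocument(String key, Map<String, String> fields) {
        docsByKey.put(key, new LinkedHashMap<>(fields));
        for (String value : fields.values()) {
            for (String term : value.toLowerCase().split("\\s+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, t -> new TreeSet<>()).add(key);
                }
            }
        }
    }

    /** Removes the document with this key from the store and from every posting list. */
    boolean deleteDocument(String key) {
        if (docsByKey.remove(key) == null) {
            return false;
        }
        postings.values().forEach(keys -> keys.remove(key));
        postings.values().removeIf(Set::isEmpty);
        return true;
    }

    /** There is no in-place update: delete by key, then add the new version. */
    void updateDocument(String key, Map<String, String> fields) {
        deleteDocument(key);
        addDocument(key, fields);
    }

    /** Returns the keys of all documents containing the term. */
    Set<String> search(String term) {
        return new TreeSet<>(postings.getOrDefault(term.toLowerCase(), Set.of()));
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("C-1", Map.of("status", "draft"));
        index.updateDocument("C-1", Map.of("status", "signed"));
        System.out.println("draft  -> " + index.search("draft"));
        System.out.println("signed -> " + index.search("signed"));
    }
}
```

In real Lucene code the same pattern is a delete by `Term` followed by adding a freshly built `Document`, as the reply says; the toy class above only illustrates the bookkeeping.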

Re: Lucene Indexing

2011-06-07 Thread bmdakshinamur...@gmail.com
If i understand the requirement correctly, contract is one document in your system which in turn contains some *'n*' fields. Contract_ID is the key in all the documents(Structures). Contract_ID is the only field you want to retrieve no matter on what field you search for. If this is the case, store

Lucene indexing & Searching

2011-06-08 Thread Pranav goyal
import java.io.File; import java.io.IOException; import java.util.Collection; import java.util.Iterator; import java.util.List; import java.util.Map; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analysis.standard.StandardAnalyzer; import org.apache.lucene.document.Document;

lucene indexing performance

2005-04-23 Thread Jayakumar.V
Hi, Maybe this query has been answered before. My first email to this user group did not generate any response. I had forwarded it to the following email ids : [EMAIL PROTECTED] java-user@lucene.apache.org This is my second email to this mail id. Hope I've reached the right place. We a

Re: lucene indexing

2006-04-07 Thread Grant Ingersoll
Lucene does not provide this out of the box. You will have to write a program to do it and feed the results to Lucene. If I remember right, these files are in XML, so you can probably use SAX or a pull parser. I think a number of TREC participants, in the past, have used Lucene, so you may
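A minimal sketch of the "write a program to split it" step, using the JDK's SAX parser. It assumes each record is a `<DOC>` element with `<DOCNO>` and `<TEXT>` children and that the content is well-formed XML once wrapped in a synthetic root; the tag names, the `TrecSplitter` class, and the wrapping trick are assumptions for illustration, and real TREC files may need cleaning (for example unescaped `&`) before a strict XML parser accepts them.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: splits one multi-document file into (DOCNO, TEXT) pairs. */
class TrecSplitter {

    /** Returns one String[2] per DOC element: [0] is the DOCNO, [1] is the TEXT. */
    static List<String[]> split(String trecContent) {
        // Assumption: a synthetic root turns the concatenated DOC blocks into one XML document.
        String xml = "<ROOT>" + trecContent + "</ROOT>";
        List<String[]> docs = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder buffer = new StringBuilder();
            private String docNo = "";
            private String text = "";

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                buffer.setLength(0); // start collecting text for the new element
                if (qName.equals("DOC")) {
                    docNo = "";
                    text = "";
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                buffer.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                switch (qName) {
                    case "DOCNO": docNo = buffer.toString().trim(); break;
                    case "TEXT":  text = buffer.toString().trim(); break;
                    case "DOC":   docs.add(new String[] {docNo, text}); break;
                    default:      break;
                }
                buffer.setLength(0);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        return docs;
    }

    public static void main(String[] args) {
        String sample = "<DOC><DOCNO> A-1 </DOCNO><TEXT>first text</TEXT></DOC>\n"
                      + "<DOC><DOCNO>A-2</DOCNO><TEXT>second text</TEXT></DOC>";
        for (String[] doc : split(sample)) {
            System.out.println(doc[0] + ": " + doc[1]);
        }
    }
}
```

Each returned pair would then be turned into one indexed document by the caller; that indexing step is left out here because it depends on the Lucene version in use.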

Lucene Indexing DB records?

2008-08-22 Thread ???
Guess I don't quite understand why there are so few posts about Lucene indexing DB records. Searched Markmail, but most of the Lucene+DB posts have to do with lucene index management. The only thing I found so far is the following, if you have a minute or two: http://kalanir.blogspot.com

Re: lucene indexing configuration

2010-08-20 Thread Shuai Weng
Hey, Currently we have indexed some biological full text pages, I was wondering how to config the schema.xml such that the gene names 'met1', 'met2', 'met3' will be treated as different words. Currently they are all mapped to 'met'. Thanks, Shuai

Re: lucene indexing configuration

2010-08-20 Thread Otis Gospodnetic
://search-lucene.com/ - Original Message > From: Shuai Weng > To: java-user@lucene.apache.org > Sent: Fri, August 20, 2010 5:47:31 PM > Subject: Re: lucene indexing configuration > > > Hey, > > Currently we have indexed some biological full text pages, I wa

Re: lucene indexing configuration

2010-08-20 Thread Shuai Weng
://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Shuai Weng To: java-user@lucene.apache.org Sent: Fri, August 20, 2010 5:47:31 PM Subject: Re: lucene indexing configuration Hey, Currently we have indexed some biological

Re: Lucene indexing error

2007-10-08 Thread Karl Wettin
On 8 Oct 2007 at 15:58, Narendra yadala wrote: Hi All I am getting this error when I am doing Indexing using Lucene. java.io.IOException: Access is denied on java.io.WinNTFileSystem.createFileExclusively Please let me know if there is any fix for this bug. Please supply the complete stack trace

Re: Lucene indexing error

2007-10-08 Thread Narendra yadala
This is the relevant portion of the stack trace: Caused by: java.io.IOException: Access is denied at java.io.WinNTFileSystem.createFileExclusively(Native Method) at java.io.File.createNewFile(File.java:850) at org.apache.jackrabbit.core.query.lucene.FSDirectory$1.obtain( FSDirect

Re: Lucene indexing error

2007-10-08 Thread Joe Attardi
On 10/8/07, Narendra yadala <[EMAIL PROTECTED]> wrote: > > This is the relevant portion of the stack trace: > > Caused by: java.io.IOException: Access is denied > at java.io.WinNTFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:850) > at or

Re: Lucene indexing error

2007-10-08 Thread Narendra yadala
I do have permission to access the Lucene files. They reside on my local machine. But this is still giving the error. I am using the Windows XP operating system. Regards Narendra On 10/8/07, Joe Attardi <[EMAIL PROTECTED]> wrote: > > On 10/8/07, Narendra yadala <[EMAIL PROTECTED]> wrote: > > > > This is

Re: Lucene indexing error

2007-10-08 Thread Joe Attardi
On 10/8/07, Narendra yadala <[EMAIL PROTECTED]> wrote: > > I do have permission to access Lucene files. They reside on my local > machine. > But still this is giving the error.I am using Windows XP operationg > system. > Well, since you are opening an IndexReader (as evidenced by your stack trace)

Re: Lucene indexing error

2007-10-08 Thread Narendra yadala
I think this bug is related to the one posted on Lucene JIRA: http://issues.apache.org/jira/browse/LUCENE-665 Please let me know if there is any solution to this bug of Lucene. Thanks Narendra On 10/8/07, Joe Attardi <[EMAIL PROTECTED]> wrote: > > On 10/8/07, Narendra yadala <[EMAIL PROTECTED]>

Re: Lucene indexing error

2007-10-08 Thread saikrishna venkata pendyala
Lucene creates a lock on the index before using it and unlocks the index after using it. If Lucene is interrupted and closed by force, the index remains in a locked state and cannot be used. On Linux, the Lucene lock file is generally created in the /tmp directory. Delete the lock
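The stale-lock situation described here can be reproduced with a plain lock file. The sketch below is an assumption-laden model, not Lucene's lock implementation: the `LockFileDemo` class and its methods are hypothetical, and the file name `write.lock` is borrowed only as an example. It relies on `java.io.File.createNewFile`, the same JDK call that appears in the stack trace earlier in this thread, which atomically creates the file and returns false if it already exists.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Hypothetical model of a file-based index lock and what a stale lock looks like. */
class LockFileDemo {

    /** Tries to take the lock by atomically creating the file; false if it already exists. */
    static boolean obtain(File lockFile) {
        try {
            return lockFile.createNewFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Releases the lock by deleting the file. */
    static void release(File lockFile) {
        lockFile.delete();
    }

    /**
     * Returns three flags: lock obtained on first try, lock obtained after a
     * simulated crash (the lock was never released), lock obtained after the
     * leftover file was deleted by hand.
     */
    static boolean[] simulateStaleLock() {
        try {
            File dir = Files.createTempDirectory("toy-index").toFile();
            File lock = new File(dir, "write.lock");

            boolean first = obtain(lock);          // the indexing process takes the lock
            // Simulated crash: the process dies here and release(lock) is never called.
            boolean afterCrash = obtain(lock);     // a new process finds the stale file
            boolean removed = lock.delete();       // manual cleanup, as suggested above
            boolean afterCleanup = removed && obtain(lock);

            release(lock);                         // tidy up the temporary files
            dir.delete();
            return new boolean[] {first, afterCrash, afterCleanup};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = simulateStaleLock();
        System.out.println("first obtain:        " + result[0]);
        System.out.println("obtain after crash:  " + result[1]);
        System.out.println("obtain after delete: " + result[2]);
    }
}
```

Deleting a lock file by hand is only safe when no other process is still using the index, which is why the advice above applies to a forced shutdown rather than to normal operation.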

Re: Lucene indexing error

2007-10-08 Thread Narendra yadala
But then the core problem is that the index that is created is in a totally corrupted state. So deleting or keeping the lock does not make a difference as the Index itself is not created properly. The problem arises when the index is getting created itself. Regards Narendra On 10/8/07, saikrishn

Re: Lucene indexing error

2007-10-08 Thread Chris Hostetter
: I think this bug is related to the one posted on Lucene JIRA: : http://issues.apache.org/jira/browse/LUCENE-665 : Please let me know if there is any solution to this bug of Lucene. note that the issue is "Closed, Resolution: Won't Fix" it was determined that ultimately there was no bug in Luce

Re: Lucene indexing error

2007-10-08 Thread Narendra yadala
Thanks very much for the information. I did not include the other portion of the stack trace because it was totally belonging to Jackrabbit library. Now I guess the problem is due to the fact that Jackrabbit's latest version is using Lucene 2.0 for its indexing purposes. So I will search some patch

Re: lucene indexing doubts

2007-10-25 Thread Karl Wettin
On 25 Oct 2007 at 19:35, poojasreejith wrote: Can anyone of you guide me, how to index into an already indexed folder. Right now, I am deleting the indexed info and running the indexer again. I don't want to do that. I want a method, how to append into the same folder when new files are ind

Re: lucene indexing doubts

2007-10-25 Thread poojasreejith

Re: lucene indexing doubts

2007-10-26 Thread Karl Wettin
On 26 Oct 2007 at 06:31, poojasreejith wrote: I have a folder which contains the indexed files. so, suppose if i want to add one more indexed data into it, without deleting the whole folder and performing the indexing for all the files again. I want it to do only that one file and add the i

Re: lucene indexing doubts

2007-10-26 Thread mark harwood
TED]> To: java-user@lucene.apache.org Sent: Friday, 26 October, 2007 5:31:36 AM Subject: Re: lucene indexing doubts hi, thanks for your response. I think you haven't understood my question. I will explain with an example. I have a folder which contains the indexed files. so, suppose i

Re: Lucene Indexing structure

2008-05-02 Thread Chris Hostetter
: Hi Lucene-user and Lucene-dev, Please do not cross post -- java-user is the suitable place for your question. : Obviously there is something wrong with the above approach (as to get the : correct document we need to get all the documents and than do the required : distance calculation), but t

Re: Lucene Indexing structure

2008-05-02 Thread Glen Newton
Vaijanath, I think I would do things in a different fashion: Lucene default distance metric is based on tf/idf and the cosine model, i.e. the frequencies of items. I believe the values that you are adding as Fields are the values in n-space for each of these image-based attributes. I don't believe
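To make the "tf/idf and the cosine model" remark concrete, here is a small textbook-style sketch. It is not Lucene's actual scoring formula: the `TfIdfCosine` class, the tokenizer, and the smoothed idf `1 + ln(N / (df + 1))` are assumptions chosen for illustration. The point is that the score depends on which terms occur and how often, which is a different notion from numeric distance between feature vectors in n-space.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Textbook tf-idf weighting with cosine similarity (illustrative, not Lucene's scorer). */
class TfIdfCosine {

    /** Lower-cases the text and splits it on non-word characters. */
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** Smoothed inverse document frequency: 1 + ln(N / (df + 1)). Assumes a non-empty corpus. */
    static double idf(String term, List<String> corpus) {
        long df = corpus.stream().filter(doc -> tokens(doc).contains(term)).count();
        return 1.0 + Math.log((double) corpus.size() / (df + 1));
    }

    /** Builds a sparse vector: term -> (raw term frequency * idf). */
    static Map<String, Double> weights(String text, List<String> corpus) {
        Map<String, Double> vector = new HashMap<>();
        for (String term : tokens(text)) {
            vector.merge(term, 1.0, Double::sum);
        }
        for (Map.Entry<String, Double> e : vector.entrySet()) {
            e.setValue(e.getValue() * idf(e.getKey(), corpus));
        }
        return vector;
    }

    /** Cosine of the angle between two sparse vectors; 0 if either vector is empty. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            normA += e.getValue() * e.getValue();
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    /** Convenience wrapper: similarity of a query to a document, both weighted against the corpus. */
    static double score(String query, String doc, List<String> corpus) {
        return cosine(weights(query, corpus), weights(doc, corpus));
    }

    public static void main(String[] args) {
        List<String> corpus = List.of("color histogram image", "texture edge histogram");
        for (String doc : corpus) {
            System.out.println(doc + " -> " + score("color histogram", doc, corpus));
        }
    }
}
```

With this kind of scoring, two images whose feature values are numerically close but not identical share no terms and score zero, which is the mismatch the reply is pointing at.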

Re: Lucene Indexing structure

2008-05-04 Thread Vaijanath N. Rao
Hi Chris, Sorry for the cross-posting and also for not making the problem clear. Let me try to explain the problem at hand. I am trying to write a CBIR (Content Based Image Retrieval) framework using Lucene. As each document has entities such as title, description, author and so on. I a

Re: Lucene Indexing structure

2008-05-04 Thread Grant Ingersoll
Would a Function Query (ValueSourceQuery, see the org.apache.lucene.search.function package) work in this case? -Grant On May 4, 2008, at 9:35 AM, Vaijanath N. Rao wrote: Hi Chris, Sorry for the cross-posting and also for not making clear the problem. Let me try to explain the problem at

Re: Lucene indexing pdf

2006-06-27 Thread Patrick Kimber
Hi Teresa You need to convert the pdf file into text format before adding the text to the Lucene index. You may like to look at http://www.pdfbox.org/ for a library to convert pdf files to text format. Patrick On 27/06/06, mcarcelen <[EMAIL PROTECTED]> wrote: Hi, I'm new with Lucene and I'm t

Re: Lucene indexing pdf

2006-06-27 Thread Suba Suresh
I used PDFBox library as mentioned in Lucene in Action. It works for me. You can access it from www.pdfbox.org suba suresh mcarcelen wrote: Hi, I'm new with Lucene and I'm trying to index a pdf but when I query everything it returns nothing. Can anyone help me? Thanks a lot Teresa ---

Re: Lucene indexing RDF

2006-06-27 Thread Suba Suresh
nsaje original- From: Suba Suresh [mailto:[EMAIL PROTECTED] Sent: Tuesday, 27 June 2006 17:38 To: java-user@lucene.apache.org Subject: Re: Lucene indexing pdf I used PDFBox library as mentioned in Lucene in Action. It works for me. You can access it from www.pdfbox.org suba s

Re: Lucene indexing RDF

2006-06-28 Thread adasal
AIL PROTECTED] > Sent: Tuesday, 27 June 2006 17:38 > To: java-user@lucene.apache.org > Subject: Re: Lucene indexing pdf > > I used PDFBox library as mentioned in Lucene in Action. It works for me. > You can access it from www.pdfbox.org > > suba suresh > >

Re: Lucene indexing RDF

2006-06-28 Thread Christiaan Fluit
adasal wrote: As far as i have researched this I know that the gnowsis project uses both rdf and lucene, but I have not had time to determine their relationship. www.gnowsis.org/ I can tell you a bit about Gnowsis, as we (Aduna) are cooperating with the Gnowsis people on RDF creation, storage

Re: Lucene indexing RDF

2006-06-29 Thread adasal
Hi Chris, I find this incredibly interesting! Thank you for your full explanation. I was aware of the components, but not the implementation. ... to provide a means to query both document full-text and metadata using an RDF model Is there any thing I can read about how you have some to this ap

Re: Lucene indexing PPT

2006-06-30 Thread Nick Burch
On Fri, 30 Jun 2006, mcarcelen wrote: > I'm trying to build a index with PPT files. I have downloaded the api > POI, "poi.bin.3.0" and "poi.src.3.0", but I don't know where may I have > to unzip them. I'd like to build the index by the command line, the same > way as I don't know about the lucene

RE: Lucene indexing PPT

2006-06-30 Thread mcarcelen
Hello Nick! Thanks for your help, it's useful for me Bye -Original Message- From: Nick Burch [mailto:[EMAIL PROTECTED] Sent: Friday, 30 June 2006 12:19 To: java-user@lucene.apache.org Subject: Re: Lucene indexing PPT On Fri, 30 Jun 2006, mcarcelen wrote: > I'm trying

Lucene Indexing on NFS

2012-12-19 Thread Bowden Wise
Hello, I have been getting the following lock error when attempting to open an index writer to add new documents to an index. org.apache.lucene.store.LockObtainFailedException Lock obtain timed out: NativeFSLock@/opt/shared/data/CTXTMNG/PAC_INDEX/lucene/aero/prod/index/write.lock I believe this i

Lucene Indexing performance issue

2014-10-22 Thread Jason Wu
Hi Team, I am a new user of Lucene 4.8.1. I encountered a Lucene indexing performance issue which slows down my application greatly. I tried several approaches from Google searches but still couldn't resolve it. Any suggestions from you experts might help me a lot. One of my applications uses the l

Re: Lucene indexing & Searching

2011-06-08 Thread Pranav goyal
Oh sry, I got my error and it worked. Thanks On Wed, Jun 8, 2011 at 3:57 PM, Pranav goyal wrote: > import java.io.File; > import java.io.IOException; > import java.util.Collection; > import java.util.Iterator; > import java.util.List; > import java.util.Map; > > import org.apache.lucene.analysi

Re: lucene indexing performance

2005-04-23 Thread Chuck Williams
One immediate optimization would be to only close the writer and open the reader if the document is present. You can have a reader open and do searches while indexing (and optimization) are underway. It's just the delete operation that requires you to close the writer (so you don't have two d

RE: lucene indexing performance

2005-05-16 Thread Jayakumar.V
2005 1:58 AM To: java-user@lucene.apache.org Subject: Re: lucene indexing performance One immediate optimization would be to only close the writer and open the reader if the document is present. You can have a reader open and do searches while indexing (and optimization) are underway. It'

Re: Lucene Indexing DB records?

2008-08-22 Thread Shalin Shekhar Mangar
You might also want to look at Solr and DataImportHandler. http://lucene.apache.org/solr http://wiki.apache.org/solr/DataImportHandler On Fri, Aug 22, 2008 at 2:56 PM, ??? <[EMAIL PROTECTED]> wrote: > Guess I don't quite understand why there are so few posts about Lucene > in

Re: Lucene Indexing DB records?

2008-08-22 Thread Chris Lu
shopping comparison site, (anonymous per request) got 2.6 Million Euro funding! On Fri, Aug 22, 2008 at 2:26 AM, ??? <[EMAIL PROTECTED]> wrote: > Guess I don't quite understand why there are so few posts about Lucene > indexing DB records. Searched Markmail, but most of the Lucene+DB p

Re: Lucene Indexing DB records?

2008-08-22 Thread Marcelo Ochoa
> Actually there are many projects for Lucene + Database. Here is a list I > know: > > * Hibernate Search > * Compass, (also Hibernate + Lucene) > * Solr + DataImportHandler (Searching + Crawler) > * DBSight, (Specific for database, closed source, but very customizable, > easy to setup) > * Browse

RE: Lucene Indexing DB records?

2008-08-22 Thread John Griffin
Try Hibernate Search - http://www.hibernate.org/410.html John G. -Original Message- From: ??? [mailto:[EMAIL PROTECTED] Sent: Friday, August 22, 2008 3:27 AM To: java-user@lucene.apache.org Subject: Lucene Indexing DB records? Guess I don't quite understand why there are so few

Lucene Indexing and Search Policy

2009-01-21 Thread MSR
Hi, Does Lucene take into consideration anything other than the frequency of the query words in a document? If it does, what are the other considerations? If it is purely based on word frequency, is it appropriate for Internet based search (where we need to consider reference count also)? Th

Lucene Indexing out of memory

2010-03-02 Thread ajay_gupta
ave enough disk space but still I am getting this error. I am not sure why even disk-based indexing gives this error. I thought disk-based indexing would be slow but at least scalable. Could someone suggest what the issue could be? Thanks Ajay

Lucene Indexing and searching - help

2007-07-04 Thread emmettwalsh
n doc; : } : : : : Searching : I have a text field in my app that typically would take in a strings like : : "Main s" , which : would end up in a query like "Main AND s*" like follows : : "B", which would end up in a query like "b*" : :

Lucene indexing for pdf files

2007-08-30 Thread Madhu
Hi all, I am indexing PDF documents using pdfbox 7.4; it works fine for some PDF files. For Japanese PDF files it gives the exception below. caught a class java.io.IOException with message: Unknown encoding for 'UniJIS-UCS2-H' Can anyone help me with how to set the encoding while reading pd

MapReduce usage with Lucene Indexing

2008-01-24 Thread roger dimitri
Hi, I am very new to Lucene & Hadoop, and I have a project where I need to use Lucene to index some input given either as a a huge collection of Java objects or one huge java object. I read about Hadoop's MapReduce utilities and I want to leverage that feature in my case described above.

How reliable is lucene indexing !!

2006-07-23 Thread vasu shah
Hello Everyone, We have an application and the current search is taking lot of time to return the results. We are doing a search against 8-9 database tables and 1.5 million records. I want to increase the search speed and thinking of implementing lucene search. I went through the documentation

newbie lucene indexing/search question

2006-12-28 Thread moraleslos
s and then have a lucene query that looks like this: [book:Guitar paragraph:"learn guitar"]. Will this query return a hit? Thanks in advance! -los

Basic Question in Lucene Indexing.

2007-04-12 Thread Lokeya

Re: Lucene Indexing on NFS

2012-12-19 Thread Ian Lea
Use SimpleFSLockFactory. See the javadocs about locks being left behind on abnormal JVM termination. There was a thread on this list a while ago about some pros and cons of using lucene on NFS. 2-Oct-2012 in fact. http://mail-archives.apache.org/mod_mbox/lucene-java-user/201210.mbox/thread -- I

Making lucene indexing multi threaded

2013-09-02 Thread nischal reddy
Hi, I am thinking of making my Lucene indexing multi-threaded. Can someone shed some light on the best approach for achieving this? I will give a short gist of what I am trying to do; please suggest the best way to tackle this. What am I trying to do? I am building an index

Re: Lucene Indexing and Search Policy

2009-01-21 Thread Anshum
Hi msr, Perhaps this could be useful for you. Lucene implements a modified vector space model in short. http://jayant7k.blogspot.com/2006/07/document-scoringcalculating-relevance_08.html -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the op

Re: Lucene Indexing and Search Policy

2009-01-21 Thread M Seetha Ramaiah
Hi Anshum, Even that document says that higher frequency implied higher score. My doubt is if the score is based only on the frequency, won't it be inappropriate for Internet based search? For example, if Google did the same thing, when I search for "Microsoft", there is a chance that Google

Re: Lucene Indexing and Search Policy

2009-01-21 Thread Anshum
It's about building a custom similarity class that scores using your normalization factors etc. This might help in that case, http://www.gossamer-threads.com/lists/lucene/java-user/69553 -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opini

Re: Lucene Indexing out of memory

2010-03-02 Thread Erick Erickson
0k documents > it also gave OOM error. I have enough disk space but still I am getting > this > error.I am not sure even for disk based indexing why its giving this error. > I thought disk based indexing will be slow but atleast it will be scalable. > Could someone suggest what could b

RE: Lucene Indexing out of memory

2010-03-02 Thread Murdoch, Paul
cene.apache.org ] On Behalf Of ajay_gupta Sent: Tuesday, March 02, 2010 8:28 AM To: java-user@lucene.apache.org Subject: Lucene Indexing out of memory Hi, It might be general question though but I couldn't find the answer yet. I have around 90k documents sizing around 350 MB. Each document con

RE: Lucene Indexing out of memory

2010-03-02 Thread Murdoch, Paul
-paul.b.murdoch=saic@lucene.apache.org ] On Behalf Of ajay_gupta Sent: Tuesday, March 02, 2010 8:28 AM To: java-user@lucene.apache.org Subject: Lucene Indexing out of memory Hi, It might be general question though but I couldn't find the answer yet. I have around 90k documents sizing around 3

Re: Lucene Indexing out of memory

2010-03-02 Thread ajay_gupta
disk space but still I am getting >> this >> error.I am not sure even for disk based indexing why its giving this >> error. >> I thought disk based indexing will be slow but atleast it will be >> scalable. >> Could so

Re: Lucene Indexing out of memory

2010-03-02 Thread Ian Lea
documents >>> it also gave OOM error. I have enough disk space but still I am getting >>> this >>> error.I am not sure even for disk based indexing why its giving this >>> error. >>> I thought disk based indexing will be slow but atleast it will be >>

Re: Lucene Indexing out of memory

2010-03-02 Thread Erick Erickson
;> error > >> so I thought to use FSDirectory method but surprisingly after 70k > >> documents > >> it also gave OOM error. I have enough disk space but still I am getting > >> this > >> error.I am not sure even for disk b

Re: Lucene Indexing out of memory

2010-03-03 Thread ajay_gupta
>> fields "word" and "context" and add these two fields with values as >>>> word >>>> value and context value. >>>> >>>> I tried this in RAM but after certain no of docs it gave out of memory >>>>

Re: Lucene Indexing out of memory

2010-03-03 Thread Ian Lea
d the new context >>>>> and >>>>> update the document. In case no context exist I create a document with >>>>> fields "word" and "context" and add these two fields with values as >>>>> word >>>>> value and

Re: Lucene Indexing out of memory

2010-03-03 Thread Erick Erickson
> and > >>>>> for each word in that document I am appending fixed number of > >>>>> surrounding > >>>>> words. To do that first I search in existing indices if this word > >>>>> already > >

Re: Lucene Indexing out of memory

2010-03-03 Thread Michael McCandless
5254-paul.b.murdoch=saic@lucene.apache.org > [mailto:java-user-return-45254-paul.b.murdoch=saic@lucene.apache.org > ] On Behalf Of ajay_gupta > Sent: Tuesday, March 02, 2010 8:28 AM > To: java-user@lucene.apache.org > Subject: Lucene Indexing out of memory > > > Hi,

Re: Lucene Indexing out of memory

2010-03-03 Thread ajay_gupta
e. >> >> http://stackoverflow.com/questions/1362460/why-does-lucene-cause-oom-whe >> n-indexing-large-files >> >> Paul >> >> >> -Original Message- >> From: java-user-return-45254-paul.b.murdoch=saic@lucene.apache.org >> [mailto

Re: Lucene Indexing out of memory

2010-03-03 Thread Erick Erickson
> > more details about the docs, or, some code fragments, could help shed > > light. > > > > Mike > > > > On Tue, Mar 2, 2010 at 8:47 AM, Murdoch, Paul > > wrote: > >> Ajay, > >> > >> Here is another thread I started on the same issue. > >

Re: Lucene Indexing out of memory

2010-03-03 Thread ajay_gupta
ument -- it must flush after the doc has been fully >> > indexed. >> > >> > This past thread (also from Paul) delves into some of the details: >> > >> > http://lucene.markmail.org/thread/pbeidtepentm6mdn >> > >> > But it's not clear whe

Re: Lucene Indexing out of memory

2010-03-04 Thread Ian Lea
rameters smaller values as well but nothing worked. >>> >>> Any hint will be very helpful. >>> >>> Thanks >>> Ajay >>> >>> >>> Michael McCandless-2 wrote: >>> > >>> > The worst case RAM usage for Lucene i

Re: Lucene Indexing out of memory

2010-03-04 Thread Michael McCandless
ystem.gc() to release >>>> memory >>>> and I also tried various other parameters like >>>> context_writer.setMaxBufferedDocs() >>>> context_writer.setMaxMergeDocs() >>>> context_writer.setRAMBufferSizeMB() >>>> I set these parame

Re: Lucene Indexing out of memory

2010-03-14 Thread ajay_gupta
n of this method and I >>>>> observed that after each call of update_context memory increases and >>>>> when >>>>> it >>>>> reaches around 65-70k it goes outofmemory so somewhere memory is >>>>> increasing >>>>>

Re: Lucene Indexing out of memory

2010-03-14 Thread ajay_gupta
ve around 90k documents sizing around 350 MB. Each document >> contains >> >>>>> a >> >>>>> record which has some text content. For each word in this text I >> want >> >>>>> to >> >>>>> store context for that word a

Re: Lucene Indexing out of memory

2010-03-15 Thread Michael McCandless
at 8:27 AM, ajay_gupta >>> wrote: >>> >>>> >>> >>>>> >>> >>>>> Hi, >>> >>>>> It might be general question though but I couldn't find the answer >>> yet. >>> >>>>> I

Re: Lucene Indexing and searching - help

2007-07-04 Thread Erick Erickson
ll tokenise our long string : doc.add(field2); : : return doc; : } : : : : Searching : I have a text field in my app that typically would take in a strings like : : "Main s" , which : would end up in a query like "Main AND s*" like follows : : "B", which would end u

Re: Lucene Indexing and searching - help

2007-07-05 Thread emmettwalsh
0;i(Hits.java:42) at org.apache.lucene.search.Searcher.search(Searcher.java:45) at org.apache.lucene.search.Searcher.search(Searcher.java:37) at com.sodaitsolutions.instantsearch.model.PropertyDatabaseImpl.search(PropertyDatabaseImpl.java:306) -- View this message in contex

Re: Lucene indexing for pdf files

2007-08-31 Thread Steven Rowe
Hi Madhu, Madhu wrote: > i am indexing pdf document using pdfbox 7.4, its working fine for some pdf > files. for japanese pdf files its giving the below exception. > > caught a class java.io.IOException > with message: Unknown encoding for 'UniJIS-UCS2-H' > > Can any one help me , how to set th

Re: How reliable is lucene indexing !!

2006-07-23 Thread karl wettin
On Sun, 2006-07-23 at 14:44 -0700, vasu shah wrote: > I have few doubts > The index size will approximately increase by 4000 records per > day. Is lucene good for the application? Sure. > Is it suitable for frequent inserts/updates? Sure, but I don't consider 4000 new documents per day to be

Re: How reliable is lucene indexing !!

2006-07-24 Thread vasu shah
Thank you very much for the quick response. I was just a little skeptical about Lucene for my application. This user forum is really supportive by posting the replies immediately. Thanks, -Vasu karl wettin <[EMAIL PROTECTED]> wrote: On Sun, 2006-07-23 at 14:44 -0700, vasu shah wro
