How to export lucene index to a simple text file?

2010-09-21 Thread Sahin Buyrukbilen
Hi, I am currently working on a project about private information retrieval and I need to have an inverted index file in txt format as follows: Term tfreq t Inverted list for t - and 1 <6, 0.159> bi
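A minimal sketch of the kind of text export described above, written in plain Java without Lucene so it runs on its own. The class name `InvertedIndexExport`, the whitespace-separated line layout (term, number of documents containing it, then `<docId, weight>` pairs), and the choice of weight (term count divided by document length) are all assumptions for illustration; the truncated message does not say how the 0.159 value was computed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class InvertedIndexExport {

    /**
     * Builds an in-memory inverted index over the given documents and renders
     * one text line per term: "term docFreq <docId, weight> <docId, weight> ...".
     * The weight is count / documentLength, which is an assumed placeholder.
     */
    static List<String> export(List<String> docs) {
        // term -> (docId -> raw count); TreeMap keeps terms and docIds sorted.
        Map<String, Map<Integer, Integer>> postings = new TreeMap<>();
        int[] docLength = new int[docs.size()];

        for (int docId = 0; docId < docs.size(); docId++) {
            for (String token : docs.get(docId).toLowerCase(Locale.ROOT).split("\\W+")) {
                if (token.isEmpty()) {
                    continue; // split() yields an empty first token on leading punctuation
                }
                docLength[docId]++;
                postings.computeIfAbsent(token, t -> new TreeMap<>())
                        .merge(docId, 1, Integer::sum);
            }
        }

        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<Integer, Integer>> entry : postings.entrySet()) {
            StringBuilder line = new StringBuilder(entry.getKey())
                    .append(' ')
                    .append(entry.getValue().size()); // number of docs containing the term
            for (Map.Entry<Integer, Integer> posting : entry.getValue().entrySet()) {
                double weight = (double) posting.getValue() / docLength[posting.getKey()];
                line.append(String.format(Locale.ROOT, " <%d, %.3f>", posting.getKey(), weight));
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample documents are made up for the demonstration.
        List<String> docs = Arrays.asList("the cat and the hat", "the dog");
        for (String line : export(docs)) {
            System.out.println(line);
        }
    }
}
```

The returned lines could be written out with `java.nio.file.Files.write(path, lines)`. Against a real Lucene index the terms and postings would come from the index reader rather than from re-tokenizing the text; that part is not shown here.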

Re: How to export lucene index to a simple text file?

2010-09-21 Thread Sahin Buyrukbilen
eier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > -----Original Message- > > From: Sahin Buyrukbilen [mailto:sahin.buyrukbi...@gmail.com] > > Sent: Tuesday, September 21, 2010 9:12 AM > > To: java-user@lucene.apache.org >

how to get the first term from index?

2010-09-30 Thread Sahin Buyrukbilen
Hi all, I need to get the first term in my index and iterate it. Can anybody help me? Best.

Re: how to get the first term from index?

2010-09-30 Thread Sahin Buyrukbilen
; > -- > Anshum Gupta > http://ai-cafe.blogspot.com > > > On Thu, Sep 30, 2010 at 11:54 PM, Sahin Buyrukbilen < > sahin.buyrukbi...@gmail.com> wrote: > > > Hi all, > > > > I need to get the first term in my index and iterate it. Can anybody help > > me? > > > > Best. > > >

How to get the score of a term in a document?

2010-10-01 Thread Sahin Buyrukbilen
Hi all, I need to retrieve the score of a term in a document. I don't want to experiment with different scoring schemes. I just checked my index with Luke, and it shows a score for each term in each document where the term occurs. So I just need to get that score. Can anybody help me? Thank you in advance. Sa

Re: How to get the score of a term in a document?

2010-10-01 Thread Sahin Buyrukbilen
d you elaborate on what you're trying to do? If you describe the > problem > you're trying to solve, people can provide better answers. > > Best > Erick > > On Fri, Oct 1, 2010 at 11:33 AM, Sahin Buyrukbilen < > sahin.buyrukbi...@gmail.com> wrote: > > &

Re: How to get the score of a term in a document?

2010-10-02 Thread Sahin Buyrukbilen
core = s.score(); > } > > I think? > > Mike > > On Fri, Oct 1, 2010 at 11:49 PM, Sahin Buyrukbilen > wrote: > > Hi Erick, > > > > I mean the score of a term in a document (we can think this as a one word > > query) which is calculated by using &

Re: How to get the score of a term in a document?

2010-10-02 Thread Sahin Buyrukbilen
} catch(IOException ex){} } } On Sat, Oct 2, 2010 at 9:42 AM, Sahin Buyrukbilen < sahin.buyrukbi...@gmail.com> wrote: > Hi Mike, > > I am already done with walking through the terms, frequencies and the docs > by using TermEnum, TermDocs, and IndexReader.

how to index large number of files?

2010-10-19 Thread Sahin Buyrukbilen
Hi all, I have to index about 4.5 million txt files. When I run my indexing application through Eclipse, I get this error: "Exception in thread "main" java.lang.OutOfMemoryError: Java heap space" eclipse -vmargs -Xmx2000m -Xss8192k eclipse -vmargs -Xms40M -Xmx2G I tried running Eclipse wit

Re: how to index large number of files?

2010-10-19 Thread Sahin Buyrukbilen
Thank you Johnbin, do you know which parameter I have to play with? On Wed, Oct 20, 2010 at 12:59 AM, Johnbin Wang wrote: > I think you can write index file once every 10,000 files or less have been > read. > > On Wed, Oct 20, 2010 at 12:11 PM, Sahin Buyrukbilen < > sahin.buy
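Johnbin's suggestion of writing the index once every 10,000 files can be sketched as a generic batching loop. This is plain Java with no Lucene dependency; `BatchedIndexing`, `processInBatches`, and the `flush` callback are hypothetical names. In a Lucene application the callback would presumably add the batch's documents and then call `IndexWriter.commit()`, but that wiring is an assumption and is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchedIndexing {

    /**
     * Walks the items once, handing them to the flush callback in groups of at
     * most batchSize so that only one batch is held in memory at a time.
     * Returns the number of times the callback was invoked.
     */
    static <T> int processInBatches(Iterable<T> items, int batchSize, Consumer<List<T>> flush) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int flushes = 0;
        for (T item : items) {
            batch.add(item);
            if (batch.size() == batchSize) {
                flush.accept(new ArrayList<>(batch)); // pass a copy, then reuse the buffer
                batch.clear();
                flushes++;
            }
        }
        if (!batch.isEmpty()) {
            flush.accept(new ArrayList<>(batch)); // final partial batch
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) {
        // Stand-in for file names; a real run would stream paths from disk.
        List<String> files = new ArrayList<>();
        for (int i = 0; i < 25_000; i++) {
            files.add("doc-" + i + ".txt");
        }
        int flushes = processInBatches(files, 10_000,
                batch -> System.out.println("flushing " + batch.size() + " files"));
        System.out.println("total flushes: " + flushes);
    }
}
```

Because `items` is an `Iterable`, the file list itself does not have to be materialized up front; a lazily produced sequence of paths would work the same way.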

Re: how to index large number of files?

2010-10-20 Thread Sahin Buyrukbilen
among all the index thread. > > Hope it's helpful. > > > > On Wed, Oct 20, 2010 at 1:05 PM, Sahin Buyrukbilen < > sahin.buyrukbi...@gmail.com> wrote: > > > Thank you Johnbin, > > do you know which parameter I have to play with? > > > > On

Re: how to index large number of files?

2010-10-20 Thread Sahin Buyrukbilen
ed, Oct 20, 2010 at 2:39 PM, Qi Li wrote: > 1. What is the difference when you used different vm parameters? > 2 What merge policy and optimization strategy did you use? > 3. How did you use the commit or flush ? > > Qi > > On Wed, Oct 20, 2010 at 2:05 PM, Sahin Buyrukbilen

Re: how to index large number of files?

2010-10-20 Thread Sahin Buyrukbilen
etRamBufferSizeMB. > > One thing I'd be interested in is how big your files are. It might be, are > you trying to process a humongous file when it blows? > > And if none of that helps, please post your stack trace. > > Best > Erick > > On Wed, Oct 20, 2010 a

Re: how to index large number of files?

2010-10-20 Thread Sahin Buyrukbilen
By the way, the files are not big, mostly 1 kB. I am working on Wikipedia articles in txt format. On Wed, Oct 20, 2010 at 11:01 PM, Sahin Buyrukbilen < sahin.buyrukbi...@gmail.com> wrote: > Unfortunately both methods didn't go through. I am getting memory error even > at reading t

Re: how to index large number of files?

2010-10-21 Thread Sahin Buyrukbilen
minutes now, but I am happy it works. After this I will try Toke's method: create folders of 100,000 files each and try to index them recursively. On Thu, Oct 21, 2010 at 4:57 AM, Toke Eskildsen wrote: > On Thu, 2010-10-21 at 05:01 +0200, Sahin Buyrukbilen wrote: > > Unfortunately both me

Re: how to index large number of files?

2010-10-22 Thread Sahin Buyrukbilen
uments' of your Debug or Run > configuration? > > Peter > > On Thu, Oct 21, 2010 at 3:26 PM, Sahin Buyrukbilen < > sahin.buyrukbi...@gmail.com> wrote: > > > I don't know why I am getting this error, but it looks normal to me now. > > because when I try to