There is.  Call the Optimize() function on the index.

You should never delete index files manually unless you know what you are
doing; otherwise you can corrupt or destroy your index.
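
For anyone who wants to see which files merely look unreferenced before optimizing, a read-only check is safer than deleting by hand. The following is a minimal sketch in Python, not Lucene.Net code. It assumes you already have the list of file names the index references (obtaining that list is not shown here), and the helper name `unreferenced_files` is hypothetical. It only reports candidates and never deletes anything.

```python
import os


def unreferenced_files(index_dir, referenced):
    """Return the names of files in index_dir that are not in `referenced`.

    `referenced` is assumed to be supplied by the caller (for example, the
    names listed by the index itself). This function is read-only: it only
    lists the directory and never modifies or deletes a file.
    """
    keep = set(referenced)
    return sorted(
        name
        for name in os.listdir(index_dir)
        if os.path.isfile(os.path.join(index_dir, name)) and name not in keep
    )
```

Treat the result as a list of candidates to investigate, and let Optimize() do the actual cleanup rather than removing the files yourself.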

-- George 

> -----Original Message-----
> From: Nic Wise [mailto:nic.w...@bbc.com] 
> Sent: Tuesday, January 13, 2009 6:36 AM
> To: lucene-net-user@incubator.apache.org
> Subject: RE: Lucene Scalability Options
> 
> I'm SURE there is a cleaner way, but in the past, we read the 
> segments file (manually :( ), and any file which wasn't 
> listed in there was considered to be a redundant file.
> 
> Worked for us. There may be a way to ask an IndexReader which 
> files it's using, and then extrapolate from there, but we 
> were using Lucene.net 1.something, which didn't.
> 
> I think that's what Luke does. Opens the index, asks Lucene 
> what it's using, kills everything else.
> 
> -----Original Message-----
> From: Nitin Shiralkar [mailto:nit...@coreobjects.com]
> Sent: 13 January 2009 11:26
> To: lucene-net-user@incubator.apache.org
> Subject: RE: Lucene Scalability Options
> 
> Hi All,
> 
> I started this thread to discuss Lucene scalability. I 
> have an index 80 GB in size. However, it looks like many of 
> the segment files are either redundant or unused. Even if I 
> delete them and just retain the CFS, segments and deletable 
> files, the index seems to work fine.
> However, I would like a cleaner approach to identify such 
> redundant/unused files through APIs. I am able to see these 
> unused files in Luke as "Deletable". However I am not sure 
> how Luke is able to identify unused files. I am using 
> Lucene.NET 2.0 version.
> 
> Can you please suggest some way?
> 
> 
> 
> -----Original Message-----
> From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
> Sent: Tuesday, January 13, 2009 1:01 AM
> To: lucene-net-user@incubator.apache.org
> Subject: RE: Lucene Scalability Options
> 
> 
> Floyd, you will need to provide more details about the 
> specific problems you are encountering.
> 
> I made a quick check, and have no difficulty opening and 
> inspecting an index I created a few minutes ago with 
> Lucene.NET v2.3.1 using Luke v0.9.1.
> 
> -- Neal
> 
> 
> -----Original Message-----
> From: Floyd Wu [mailto:floyd...@gmail.com]
> Sent: Friday, January 09, 2009 8:18 PM
> To: lucene-net-user@incubator.apache.org
> Subject: Re: Lucene Scalability Options
> 
> Hi all,
> It seems the new version of Luke is not compatible with 
> Lucene.Net, so I emailed the creator of Luke. Below is 
> his feedback:
> 
> "Yes, there have been many changes,
> but Lucene 2.4 can still open indexes built with earlier 
> versions of Lucene/Java.
> This is the second report I've got about the possible 
> incompatibility with Lucene.Net - I suggest to raise up this 
> issue on the Lucene mailing list ( 
> java-...@lucene.apache.org), and provide more details, eg. 
> Lucene.Net revision, stack trace, a small sample index if you can."
> 
> My original report is below:
> "The situation is that Luke-0.9 cannot open index files 
> built by Lucene.Net-2.3.1.
> I tried older versions of Luke and confirmed that Luke-0.8 and 
> Luke-0.8.1 can open and read the index files fine.
> I wonder if there is any change between Java Lucene 2.3 and 2.4.
> Please help with this."
> 
> Floyd
> 
> 
> 
> 2009/1/9 George Aroush <geo...@aroush.net>
> 
> > Hi Nitin,
> >
> > Any optimization that Luke can do on an index is also doable by making
> > API calls from Lucene.Net.  If not, then there is either a bug in
> > Lucene.Net or in your use of the API.  Can you share with us your API
> > calls as well as the Lucene.Net version you are using?
> >
> > Thanks.
> >
> > -- George
> >
> > > -----Original Message-----
> > > From: Nitin Shiralkar [mailto:nit...@coreobjects.com]
> > > Sent: Friday, January 09, 2009 6:27 AM
> > > To: lucene-net-user@incubator.apache.org
> > > Subject: RE: Lucene Scalability Options
> > >
> > > Thanks Hugh. Yes, I tried using Luke for index optimization.
> > > Surprisingly, it brought the index size down to ~20 GB with only
> > > one CFS and segment files left behind. I used the compound
> > > optimization option. However, I use the similar "SetUseCompoundFile"
> > > property on the "IndexModifier" object in my Lucene.NET code, and it
> > > has no effect on size or files after optimization. Any suggestions?
> > >
> > >
> > > -----Original Message-----
> > > From: Hugh Spiller [mailto:hugh.spil...@renishaw.com]
> > > Sent: Friday, January 09, 2009 3:35 PM
> > > To: lucene-net-user@incubator.apache.org
> > > Subject: RE: Lucene Scalability Options
> > >
> > > Hi Nitin,
> > >
> > > I've found the easiest way to get rid of redundant files in an index
> > > is to use Luke. As soon as you use it to open the index, it tidies
> > > up all the cruft.
> > >
> > > It's at http://www.getopt.org/luke/ .
> > >
> > > ________________________________
> > >
> > > Hugh Spiller
> > >
> > >
> > > -----Original Message-----
> > > From: Nitin Shiralkar [mailto:nit...@coreobjects.com]
> > > Sent: 09 January 2009 08:48
> > > To: lucene-net-user@incubator.apache.org
> > > Subject: RE: Lucene Scalability Options
> > >
> > > -- snip --
> > >
> > >
> > > Any input on junk/redundant files in the above list?
> > >
> > >
> > >
> > >
> >
> > 
> 
