Short summary:
 * If I could make Solr merge the oldest segments (or the ones with the most deleted docs) rather than the smallest segments, I think I'd almost never need "optimize".
 * Can I tell Solr to do this? If not, can someone point me in the right direction regarding where I might patch it to try this myself?

I have a system where documents are refreshed and/or expired pretty much in a FIFO manner. In particular, no document in the system can live for over 1 month. Without frequent optimizes, ISTM my indexes tend to get bloated with mostly deleted content.

I've attached an ls -l listing below, showing that the largest segments in my index are all from July. A query of timestamp:([1999-01-01T00:00:00Z TO 2010-08-01T23:59:59Z]) returns no documents, so it appears to me that the first 2 segments are entirely filled with deleted documents.

I imagine this is not too uncommon a situation; for example, a web crawler that periodically updates web pages that contain some dynamic content.

Perhaps a different good criterion would be to merge the segments with the largest number of deleted documents. In my case it'd be the same, but I could imagine non-FIFO, update-heavy systems where that would work better.

$ ls -lrt *.fdt
-rw-rw-r-- 1 ramayer ramayer 291490823897 Jul 20 21:34 _u63.fdt
-rw-rw-r-- 1 ramayer ramayer 78251326159 Jul 29 18:15 _xkh.fdt
-rw-rw-r-- 1 ramayer ramayer 69295141685 Aug 8 01:29 _10f5.fdt
-rw-rw-r-- 1 ramayer ramayer 5406369697 Aug 10 21:14 _13fv.fdt
-rw-rw-r-- 1 ramayer ramayer 66210508029 Aug 10 21:44 _13g1.fdt
-rw-rw-r-- 1 ramayer ramayer 2001873014 Aug 10 23:05 _13io.fdt
-rw-rw-r-- 1 ramayer ramayer 1578531820 Aug 11 14:10 _13m8.fdt
-rw-rw-r-- 1 ramayer ramayer 2254917604 Aug 12 03:49 _13p3.fdt
-rw-rw-r-- 1 ramayer ramayer 2890967852 Aug 12 06:49 _13s6.fdt
-rw-rw-r-- 1 ramayer ramayer 2820285238 Aug 12 09:49 _13v9.fdt
-rw-rw-r-- 1 ramayer ramayer 2905550377 Aug 12 12:52 _13yc.fdt
-rw-rw-r-- 1 ramayer ramayer 2776837514 Aug 12 15:54 _141f.fdt
-rw-rw-r-- 1 ramayer ramayer 259698816 Aug 12 16:15 _141p.fdt
-rw-rw-r-- 1 ramayer ramayer 290083173 Aug 12 16:34 _1420.fdt
-rw-rw-r-- 1 ramayer ramayer 279500106 Aug 12 16:54 _142b.fdt
-rw-rw-r-- 1 ramayer ramayer 277156197 Aug 12 17:17 _142m.fdt
-rw-rw-r-- 1 ramayer ramayer 91360010 Aug 13 00:27 _142x.fdt
-rw-rw-r-- 1 ramayer ramayer 7351514 Aug 13 00:37 _142y.fdt
-rw-rw-r-- 1 ramayer ramayer 7286 Aug 13 00:38 _142z.fdt
-rw-rw-r-- 1 ramayer ramayer 21 Aug 13 01:07 _1430.fdt
-rw-rw-r-- 1 ramayer ramayer 21 Aug 13 02:07 _1431.fdt
-rw-rw-r-- 1 ramayer ramayer 21 Aug 13 03:07 _1432.fdt
-rw-rw-r-- 1 ramayer ramayer 21 Aug 13 04:07 _1433.fdt
-rw-rw-r-- 1 ramayer ramayer 2388369 Aug 13 04:35 _1434.fdt
-rw-rw-r-- 1 ramayer ramayer 21 Aug 13 05:07 _1435.fdt
-rw-rw-r-- 1 ramayer ramayer 21 Aug 13 06:07 _1436.fdt
-rw-rw-r-- 1 ramayer ramayer 21 Aug 13 07:07 _1437.fdt
-rw-rw-r-- 1 ramayer ramayer 21 Aug 13 08:07 _1438.fdt
-rw-rw-r-- 1 ramayer ramayer 21 Aug 13 09:07 _1439.fdt
-rw-rw-r-- 1 ramayer ramayer 21 Aug 13 10:07 _143a.fdt
-rw-rw-r-- 1 ramayer ramayer 198581 Aug 13 11:04 _143b.fdt
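
To make the two selection rules I'm asking about concrete, here is a rough sketch in plain Python. It is not Lucene or Solr code: the Segment class, its fields, the function names, the thresholds, and the sample numbers are all hypothetical and chosen only for illustration. In Lucene itself I'd guess the equivalent logic would have to live in a custom MergePolicy, but I haven't verified that.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for per-segment metadata (not a Lucene class)."""
    name: str
    created_order: int   # lower = older segment
    max_doc: int         # total docs in the segment, including deleted ones
    deleted_docs: int    # docs marked deleted but still occupying space

    @property
    def deleted_ratio(self) -> float:
        # Guard against empty segments to avoid division by zero.
        return self.deleted_docs / self.max_doc if self.max_doc else 0.0


def pick_by_deletes(segments, min_deleted_ratio=0.5, max_segments=10):
    """Rule 1: merge the segments with the highest fraction of deleted docs."""
    candidates = [s for s in segments if s.deleted_ratio >= min_deleted_ratio]
    candidates.sort(key=lambda s: s.deleted_ratio, reverse=True)
    return candidates[:max_segments]


def pick_oldest(segments, max_segments=2):
    """Rule 2: merge the oldest segments first (the FIFO case)."""
    return sorted(segments, key=lambda s: s.created_order)[:max_segments]


# Made-up sample data: an old, fully deleted segment, an old but mostly
# live one, a newer heavily deleted one, and a fresh small one.
segments = [
    Segment("_a", 0, 1000, 1000),
    Segment("_b", 1, 1000, 100),
    Segment("_c", 2, 1000, 900),
    Segment("_d", 3, 10, 0),
]

print([s.name for s in pick_by_deletes(segments)])  # ['_a', '_c']
print([s.name for s in pick_oldest(segments)])      # ['_a', '_b']
```

With strictly FIFO expiry the two rules would pick the same segments; the sample data is deliberately non-FIFO to show where selecting by deleted ratio does better than selecting by age.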