Re: One index or multiple?

2012-10-07 Thread Billy Newman
Walter, Thanks! You bring up a very important 'commit' problem which I had not thought about. So I am running a DIH that is wiping out part of the index (i.e. all animals), then re-indexing/re-importing. I have another DIH that is wiping out part of the index (minerals), then

Re: One index or multiple?

2012-10-07 Thread Erick Erickson
My personal approach would be to take DIH out of the mix entirely and do the whole thing in SolrJ where you can exercise control to whatever degree you want. DIH is a fine tool, but sometimes it's wrong for a particular situation. Here's some code to get you started if you want to go that route.

Re: One index or multiple?

2012-10-07 Thread Billy Newman
Erick, Thanks for all the help. What a great community. Unfortunately the 2 data sets I want to use the DIH for change a ton and are changed by a web app accessible to a number of people, as well as a few other internal server applications. Since the data sets were a small figured

Re: One index or multiple?

2012-10-06 Thread Erick Erickson
Sure, you need to define the appropriate delete query for each DIH entry. Best, Erick. On Fri, Oct 5, 2012 at 5:40 PM, Billy Newman newman...@gmail.com wrote: Does DIH support only deleting/re-indexing docs of a certain type? I.e., can I have a DIH for type:vegetable and another for type:mineral

Re: One index or multiple?

2012-10-06 Thread Walter Underwood
Right. You define three update handlers, something like /update-animal, /update-mineral, and /update-vegetable. Each one has a separate DIH config. Each config deletes documents of that type and loads documents of that type. You will not want to run them at the same time, because a commit in
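One way to keep the three imports from overlapping is to drive them from a single script that starts one handler, waits until it is idle again, and only then starts the next. The Python below is a minimal sketch of that ordering only: the handler paths come from the message above, while `start` and `is_idle` are hypothetical callables you would supply yourself (for example, thin wrappers around HTTP requests to each handler). Nothing here is part of Solr or DIH.

```python
import time
from typing import Callable, List, Sequence

# Handler paths as suggested in the message above.
HANDLERS = ["/update-animal", "/update-mineral", "/update-vegetable"]


def run_imports_sequentially(
    start: Callable[[str], None],
    is_idle: Callable[[str], bool],
    handlers: Sequence[str] = HANDLERS,
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run each import to completion before starting the next one.

    `start` and `is_idle` are hypothetical, caller-supplied callables:
    `start(handler)` kicks off the import for that handler, and
    `is_idle(handler)` reports whether that import has finished.
    """
    finished = []
    for handler in handlers:
        start(handler)
        # Wait for this import to finish so two imports never overlap.
        while not is_idle(handler):
            sleep(poll_seconds)
        finished.append(handler)
    return finished
```

Passing `sleep` as a parameter makes the loop easy to exercise without real delays; in production the default `time.sleep` would be used.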

Re: One index or multiple?

2012-10-05 Thread Erick Erickson
The very first question is what form are your XML docs in? Solr does NOT index arbitrary XML, so I'm guessing you're using DIH and some of the XML stuff there. Do note that the XSLT is a subset of the full capabilities. Second, I'd recommend you just put it all in a single index, it'll be

Re: One index or multiple?

2012-10-05 Thread Billy Newman
Erick, I did mention using the DIH to index the first two datasets; that is where the root of my problem lies. I do see the benefit of one index. However the question still remains, can I use the DIH to index XML from data set 1 and 2, every 15 minutes or so (full index) without wiping out

Re: One index or multiple?

2012-10-05 Thread Erick Erickson
DIH always gives me indigestion. Couple of things: See the 'clean' parameter here for full import: http://wiki.apache.org/solr/DataImportHandler it defaults to true. I think if you set it to false _and_ assuming that your uniqueKey is defined, it should work OK. The other approach would be
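As a rough illustration of the `clean` parameter discussed above, this Python sketch builds a full-import request URL with `clean=false`. The base URL and the `/dataimport` handler path are assumptions chosen for the example, not details from this thread; adjust them to match your own configuration.

```python
from urllib.parse import urlencode


def full_import_url(base_url: str, handler: str = "/dataimport", clean: bool = False) -> str:
    """Build a DIH full-import URL.

    `base_url` and `handler` are illustrative assumptions. With
    clean=false, DIH does not wipe the index before importing, so
    existing documents stay unless replaced by a matching uniqueKey.
    """
    params = {
        "command": "full-import",
        "clean": "true" if clean else "false",
        "commit": "true",
    }
    return f"{base_url.rstrip('/')}{handler}?{urlencode(params)}"
```

Calling `full_import_url("http://localhost:8983/solr")` would give a URL you could request with any HTTP client. Note that this approach alone does not remove documents that have disappeared from the source data.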

Re: One index or multiple?

2012-10-05 Thread Walter Underwood
Using the same unique key doesn't handle documents which disappear from one indexing to the next. Instead, add a field for the type of item, like type:animal, type:vegetable, or type:mineral. Then the query used to clean up before indexing can delete all items of that type. wunder On Oct 5,
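To make the per-type cleanup concrete, here is a small Python sketch that builds a Solr XML delete-by-query message for one item type. The `type` field and its three values come from the message above; the function name and the allow-list check are illustrative assumptions, and the sketch only builds the message rather than sending it.

```python
import xml.etree.ElementTree as ET

# Item types taken from the message above.
ALLOWED_TYPES = {"animal", "vegetable", "mineral"}


def delete_by_type_message(item_type: str) -> str:
    """Return an XML delete-by-query message for one item type.

    Restricting input to a known set (an assumption of this sketch)
    avoids building a query from arbitrary text.
    """
    if item_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown item type: {item_type!r}")
    delete = ET.Element("delete")
    query = ET.SubElement(delete, "query")
    query.text = f"type:{item_type}"
    return ET.tostring(delete, encoding="unicode")
```

The resulting string could be posted to Solr's update handler before re-indexing that type. If you stay with DIH, the same `type:...` query is the kind of thing you would put in the entity's cleanup query instead.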

Re: One index or multiple?

2012-10-05 Thread Billy Newman
Does DIH support only deleting/re-indexing docs of a certain type? I.e., can I have a DIH for type:vegetable and another for type:mineral and each only deletes/recreates the right types? Thanks. On Fri, Oct 5, 2012 at 1:04 PM, Walter Underwood wun...@wunderwood.org wrote: Using the same unique

Re: All in one index, or multiple indexes?

2009-07-22 Thread Noble Paul നോബിള്‍ नोब्ळ्
Keep in mind that every time a commit is done, all the caches are thrown away. If updates for each of these indexes happen at different times, then the caches get invalidated each time you commit. So in that case smaller indexes help. On Wed, Jul 8, 2009 at 4:55 PM, Tim Sell trs...@gmail.com wrote:

Re: All in one index, or multiple indexes?

2009-07-21 Thread Jim Adams
It will depend on how much total volume you have. If you are discussing millions and millions of records, I'd say use multicore and shards. On Wed, Jul 8, 2009 at 5:25 AM, Tim Sell trs...@gmail.com wrote: Hi, I am wondering if it is common to have just one very large index, or multiple

All in one index, or multiple indexes?

2009-07-08 Thread Tim Sell
Hi, I am wondering if it is common to have just one very large index, or multiple smaller indexes specialized for different content types. We currently have multiple smaller indexes, although one of them is much larger than the others. We are considering merging them, to allow the convenience of