Creating a unique id for a schema is one of those design tasks: http://wiki.apache.org/solr/UniqueKey
A marvelously lucid and well-written page, if I do say so. And I do.

On Tue, Oct 26, 2010 at 10:16 PM, Tharindu Mathew <mcclou...@gmail.com> wrote:
> Really great to know you were able to fire up about 100 cores. But
> when it scales up to around 1000 or even more, I wonder how it would
> perform.
>
> I have a question regarding ids, i.e. the unique key. Since there is a
> potential use case where two users might add the same document, how
> would we set the id? I was thinking of appending the user id to the
> id I would use, e.g. "/system/bar.pdfuserid25". Otherwise, Solr would
> replace the document of one user, which is not what we want.
>
> This is also applicable to deleteById. Is there a better way to do this?
>
> On Tue, Oct 26, 2010 at 7:45 PM, Jonathan Rochkind <rochk...@jhu.edu> wrote:
>> mike anderson wrote:
>>>
>>> I'm really curious if there is a clever solution to the obvious problem
>>> with: "So you're better off using a single index and with a user id and use
>>> a query filter with the user id when fetching data.", i.e. when you have
>>> hundreds of thousands of user IDs tagged on each article. That just
>>> doesn't sound like it scales very well.
>>>
>>
>> Actually, I think that design would scale pretty fine, I don't think there's
>> an 'obvious' problem. You store your userIDs in a multi-valued field (or as
>> multiple terms in a single value, ends up being similar). You fq on there
>> with the current userID. There's one way to find out, of course, but that
>> doesn't seem a patently ridiculous scenario or anything; that's the kind of
>> thing Solr is generally good at, it's what it's built for. The problem
>> might actually be in the time it takes to add such a document to the index,
>> but not in query time.
>>
>> Doesn't mean it's the best solution for your problem though, I can't say.
>>
>> My impression is that Solr in general isn't really designed to support the
>> kind of multi-tenancy use case people are talking about lately. So trying
>> to make it work anyway... if multi-cores work for you, then great, but be
>> aware they weren't really designed for that (having thousands of cores) and
>> may not. If a single index can work for you instead, great, but as you've
>> discovered it's not necessarily obvious how to set up the schema to do what
>> you need. Really this applies to Solr in general: unlike an rdbms, where
>> you just third-form-normalize everything and figure it'll work for almost
>> any use case that comes up, in Solr you generally need to custom fit the
>> schema for your particular use cases, sometimes being kind of clever to
>> figure out the optimal way to do that.
>>
>> This is, I'd argue/agree, indeed kind of a disadvantage: setting up a Solr
>> index takes more intellectual work than setting up an rdbms. The trade-off
>> is you get speed, and flexible ways to set up relevancy (that still perform
>> well). Took a couple decades for rdbms to get as brainless to use as they
>> are; maybe in a couple more we'll have figured out ways to make indexing
>> engines like Solr equally brainless, but not yet. It's still pretty
>> damn easy for what it is, though; the Lucene/Solr folks have done a
>> remarkable job.
>>
>
> --
> Regards,
>
> Tharindu
>

--
Lance Norskog
goks...@gmail.com
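
For what it's worth, here is a rough sketch of the two ideas discussed in the quoted thread: a per-user unique key, so that two users adding the same path don't overwrite each other, and a filter query on the user id. It is only an illustration under assumptions, not something taken from the wiki page. The ":" separator, the `user_id` field name, the helper function names, and the assumption that user ids are plain integers are all hypothetical. The `<delete><id>` message and the `q`/`fq` parameters are the standard Solr ones.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape

# Hypothetical separator between the user id and the document path.
# Putting the (integer) user id first keeps the key unambiguous even
# if the path itself contains the separator.
ID_SEPARATOR = ":"


def make_doc_id(user_id: int, path: str) -> str:
    """Build a per-user unique key, e.g. user 25 + /system/bar.pdf."""
    return f"{user_id}{ID_SEPARATOR}{path}"


def delete_by_id_xml(user_id: int, path: str) -> str:
    """Build a Solr XML delete-by-id message for one user's copy only."""
    return f"<delete><id>{escape(make_doc_id(user_id, path))}</id></delete>"


def user_query_params(user_id: int, q: str) -> dict:
    """Query parameters that restrict results to one user via fq.

    Assumes the schema has an indexed field named user_id (hypothetical);
    it could be single-valued (one copy per user) or multi-valued
    (one shared document tagged with many user ids).
    """
    return {"q": q, "fq": f"user_id:{user_id}"}


if __name__ == "__main__":
    print(make_doc_id(25, "/system/bar.pdf"))
    print(delete_by_id_xml(25, "/system/bar.pdf"))
    # URL-encode the parameters before appending them to a /select request.
    print(urlencode(user_query_params(25, "report")))
```

With per-user keys the same file is indexed once per user, which costs index size but makes deleteById affect only that user's copy. The multi-valued field approach keeps one copy per file, and removing a user then means re-indexing the document with that user's id dropped from the field.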