Creating a unique id for a schema is one of those design tasks:

http://wiki.apache.org/solr/UniqueKey

A marvelously lucid and well-written page, if I do say so. And I do.
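
For the composite id proposed in the quoted message below, here is a minimal sketch in Python, not a definitive recipe. The separator character and the helper names are my own assumptions for illustration. The one thing the sketch adds to the "/system/bar.pdfuserid25" idea is an explicit separator, so the id can be split back into its parts, and the same helper builds the id for both adds and deleteById. The delete message uses Solr's standard XML update syntax.

```python
from xml.sax.saxutils import escape

# Hypothetical separator: pick any character that can never occur in a user id.
SEP = "!"


def make_doc_id(user_id: str, path: str) -> str:
    """Build a per-user unique key such as '25!/system/bar.pdf'."""
    if not user_id or SEP in user_id:
        raise ValueError("user id must be non-empty and must not contain the separator")
    return f"{user_id}{SEP}{path}"


def split_doc_id(doc_id: str) -> tuple[str, str]:
    """Recover (user_id, path); split on the first separator only."""
    user_id, path = doc_id.split(SEP, 1)
    return user_id, path


def delete_by_id_xml(user_id: str, path: str) -> str:
    """Build a delete-by-id update message for one user's copy of a document."""
    return f"<delete><id>{escape(make_doc_id(user_id, path))}</id></delete>"
```

Putting the user id first means two users adding "/system/bar.pdf" get two distinct keys, so neither add replaces the other.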

On Tue, Oct 26, 2010 at 10:16 PM, Tharindu Mathew <mcclou...@gmail.com> wrote:
> Really great to know you were able to fire up about 100 cores. But
> when it scales up to around 1000 or even more, I wonder how it would
> perform.
>
> I have a question regarding ids, i.e. the unique key. Since there is a
> potential use case where two users might add the same document, how
> would we set the id? I was thinking of appending the user id to the
> id I would use, e.g. "/system/bar.pdfuserid25". Otherwise, Solr would
> replace the document of one user, which is not what we want.
>
> This is also applicable to deleteById. Is there a better way to do this?
>
> On Tue, Oct 26, 2010 at 7:45 PM, Jonathan Rochkind <rochk...@jhu.edu> wrote:
>> mike anderson wrote:
>>>
>>> I'm really curious if there is a clever solution to the obvious problem
>>> with: "So you're better off using a single index and with a user id and use
>>> a query filter with the user id when fetching data.", i.e., when you have
>>> hundreds of thousands of user IDs tagged on each article. That just
>>> doesn't sound like it scales very well.
>>>
>>
>> Actually, I think that design would scale pretty fine, I don't think there's
>> an 'obvious' problem. You store your userIDs in a multi-valued field (or as
>> multiple terms in a single value, ends up being similar). You fq on there
>> with the current userID.   There's one way to find out of course, but that
>> doesn't seem a patently ridiculous scenario or anything, that's the kind of
>> thing Solr is generally good at, it's what it's built for.   The problem
>> might actually be in the time it takes to add such a document to the index;
>> but not in query time.
>>
>> Doesn't mean it's the best solution for your problem though, I can't say.
>>
>> My impression is that Solr in general isn't really designed to support the
>> kind of multi-tenancy use case people are talking about lately.  So trying
>> to make it work anyway... if multi-cores work for you, then great, but be
>> aware they weren't really designed for that (having thousands of cores) and
>> may not work at that scale. If a single index can work for you instead,
>> great, but as you've
>> discovered it's not necessarily obvious how to set up the schema to do what
>> you need -- really this applies to Solr in general, unlike an rdbms where
>> you just third-form-normalize everything and figure it'll work for almost
>> any use case that comes up,  in Solr you generally need to custom fit the
>> schema for your particular use cases, sometimes being kind of clever to
>> figure out the optimal way to do that.
>>
>> This is, I'd argue/agree, indeed kind of a disadvantage, setting up a Solr
>> index takes more intellectual work than setting up an rdbms. The trade off
>> is you get speed, and flexible ways to set up relevancy (that still perform
>> well). Took a couple decades for rdbms to get as brainless to use as they
>> are, maybe in a couple more we'll have figured out ways to make indexing
>> engines like Solr equally brainless, but not yet -- but it's still pretty
>> damn easy for what it is, the Lucene/Solr folks have done a remarkable job.
>>
>
>
>
> --
> Regards,
>
> Tharindu
>
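
The single-index design Jonathan describes above (user IDs in a multi-valued field, with an fq on the current user) could look roughly like the sketch below. It only builds the document and the request parameters with the Python standard library. The field name `user_ids` and the function names are assumptions for illustration, not anything from a real schema.

```python
from urllib.parse import urlencode

# One shared document; "user_ids" is a hypothetical multi-valued field
# listing every user who has added this document.
doc = {"id": "/system/bar.pdf", "user_ids": ["25", "31", "47"]}


def user_filter(user_id: str, field: str = "user_ids") -> str:
    """Build a filter query restricting results to one user's documents."""
    # Escape backslashes and double quotes so the id stays inside the phrase.
    escaped = user_id.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def select_params(query: str, user_id: str) -> str:
    """URL-encode the q and fq parameters for a select request."""
    return urlencode({"q": query, "fq": user_filter(user_id)})
```

With this layout the unique key stays the plain path and there is one copy of each document, at the cost of re-indexing it whenever a user is added to or removed from `user_ids`. That is the opposite trade-off from a per-user composite id.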



-- 
Lance Norskog
goks...@gmail.com
