Hi Waldemar, Alex,

why didn't you start separate threads for the different issues? :\

Regarding getting .filter() to work, I suggest we use explicit
and implicit indexes, something like this:

class User(models.Model):
    # db_index=True should add the third line (username_index) implicitly
    username = models.CharField(max_length=200, db_index=True)
    # db_index=True can help, but we have to go explicit here.
    email = models.CharField(max_length=200)
    # alternative to the line above: further ideas on lowercase support
    email = models.CharField(max_length=200, unique=True, db_index='lowercase')
    #username_index = models.Index(['username'], 'pk')  # line 3
    email_index = models.Index(['email'], 'pk', filter={'email': 'lowercase'})  # line 4
    ...

And the user will write:
users = User.objects.filter(email_index__startswith='me@')

Or:

class UsernameIndex(models.Index):
    model = User
    keys = ['category', 'email']
    values = ['id']

    def clean_email(self, value):
        # normalize the key so that lookups are case-insensitive
        return value.lower()

and the user writes:
users = UsernameIndex.objects.filter(email__startswith='me@', category='staff')

But I don't think this is related to GSoC at all.
And it can be built outside of Django.
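
To make the index idea a bit more concrete, here is a minimal sketch in
plain Python (no Django) of what such an index could do behind the scenes.
It is only an illustration under my own assumptions: EmailIndex, add() and
startswith() are hypothetical names, not an existing API. It keeps cleaned
keys in sorted order, so a prefix lookup becomes a range scan:

```python
import bisect


class EmailIndex:
    """Hypothetical stand-in for the proposed index: cleaned email -> pk."""

    def __init__(self):
        # Sorted list of (cleaned_email, pk) tuples.
        self._entries = []

    @staticmethod
    def clean_email(value):
        # Same normalization as clean_email() in the proposal above.
        return value.lower()

    def add(self, email, pk):
        # Keep the list sorted so prefix lookups can use binary search.
        bisect.insort(self._entries, (self.clean_email(email), pk))

    def startswith(self, prefix):
        prefix = self.clean_email(prefix)
        # (prefix,) sorts before every (key, pk) whose key starts with prefix.
        start = bisect.bisect_left(self._entries, (prefix,))
        pks = []
        for key, pk in self._entries[start:]:
            if not key.startswith(prefix):
                break  # matching keys are contiguous in sorted order
            pks.append(pk)
        return pks


index = EmailIndex()
index.add('Me@Example.com', 1)
index.add('other@example.com', 2)
index.add('me@work.example', 3)
matching_pks = index.startswith('ME@')
```

A real backend would store the entries in the datastore itself and would
have to update them on every save and delete; the sketch only shows the
lookup side.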

The same for Counters:

class User(models.Model):
   username = models.CharField(max_length=200, db_index=True)
   email = models.CharField(max_length=200)
   category = models.CharField(max_length=20, choices=(('S', 'Staff'), ('U', 'User')))
   # count number of users by category
   counter = aggregates.Counter('category')
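
As a rough sketch of the bookkeeping such a counter would have to do (plain
Python, no Django; CategoryCounter and its on_* methods are hypothetical
names I made up for illustration, not part of any existing aggregates
module):

```python
from collections import Counter


class CategoryCounter:
    """Hypothetical sketch: keeps a per-category count of users."""

    def __init__(self):
        self._counts = Counter()

    def on_create(self, category):
        # Called when a new user is saved.
        self._counts[category] += 1

    def on_delete(self, category):
        # Called when a user is deleted.
        self._counts[category] -= 1

    def on_update(self, old_category, new_category):
        # Called when an existing user changes category.
        if old_category != new_category:
            self._counts[old_category] -= 1
            self._counts[new_category] += 1

    def count(self, category):
        # Counter returns 0 for categories it has never seen.
        return self._counts[category]


counter = CategoryCounter()
counter.on_create('S')
counter.on_create('U')
counter.on_create('U')
counter.on_update('U', 'S')
counter.on_delete('S')
```
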

I think we should handle these as separate Django proposal(s) rather
than putting them on Alex's shoulders.

What Alex will need to provide is a way to assign a hook for
.filter() from QuerySet, which should probably be put into some kind
of NosqlManager (which would replace Manager at User.objects).
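
A toy sketch of what I mean by a .filter() hook, in plain Python rather
than Django. Everything here is an assumption for illustration: ToyQuerySet,
this NosqlManager, the filter_hook parameter and lowercase_email_hook are
hypothetical and do not reflect Django's actual Manager or QuerySet code.
The hook gets the lookups first and may answer them (for example from an
index); if it returns None, the default filtering is used:

```python
class ToyQuerySet:
    """Hypothetical, heavily simplified queryset over a list of dicts."""

    def __init__(self, rows, filter_hook=None):
        self._rows = list(rows)
        self._filter_hook = filter_hook

    def __iter__(self):
        return iter(self._rows)

    def filter(self, **lookups):
        if self._filter_hook is not None:
            # Give the hook the first chance to answer the query.
            result = self._filter_hook(self, lookups)
            if result is not None:
                return ToyQuerySet(result, self._filter_hook)
        # Default behaviour: exact match on every given field.
        rows = [row for row in self._rows
                if all(row.get(key) == value for key, value in lookups.items())]
        return ToyQuerySet(rows, self._filter_hook)


class NosqlManager:
    """Hypothetical manager that passes a filter hook to its querysets."""

    def __init__(self, rows, filter_hook=None):
        self._rows = rows
        self._filter_hook = filter_hook

    def filter(self, **lookups):
        return ToyQuerySet(self._rows, self._filter_hook).filter(**lookups)


def lowercase_email_hook(queryset, lookups):
    # Only handles a lone email__startswith lookup; declines anything else.
    prefix = lookups.get('email__startswith')
    if prefix is None or len(lookups) != 1:
        return None
    return [row for row in queryset
            if row['email'].lower().startswith(prefix.lower())]


rows = [
    {'id': 1, 'email': 'Me@Example.com'},
    {'id': 2, 'email': 'other@example.com'},
]
objects = NosqlManager(rows, filter_hook=lowercase_email_hook)
hooked_ids = [row['id'] for row in objects.filter(email__startswith='me@')]
default_ids = [row['id'] for row in objects.filter(id=2)]
```

With a hook like this, the index and counter ideas above could live in a
separate package that just registers its own hook.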

On Thu, Apr 8, 2010 at 8:08 PM, Waldemar Kornewald <wkornew...@gmail.com> wrote:
> On Thu, Apr 8, 2010 at 6:14 PM, Alex Gaynor <alex.gay...@gmail.com> wrote:
>> On Wed, Apr 7, 2010 at 4:43 PM, Waldemar Kornewald <wkornew...@gmail.com> wrote:
>>> On Wed, Apr 7, 2010 at 5:12 PM, Alex Gaynor <alex.gay...@gmail.com> wrote:
>>>> No.  I am vehemently opposed to attempting to extensively emulate the
>>>> features of a relational database in a non-relational one.  People
>>>> talk about the "object relational" impedance mismatch, much less the
>>>> "object-relational non-relational" one.  I have no interest in
>>>> attempting to support any attempts at emulating features that just
>>>> don't exist on the databases they're being emulated on.
>>>
>>> This decision has to be based on the actual needs of NoSQL developers.
>>> Did you actually work on non-trivial projects that needed
>>> denormalization and in-memory JOINs and manually maintained counters?
>>> I'm not making this up. The "dumb" key-value store API is not enough.
>>> People are manually writing lots of code for features that could be
>>> handled by an SQL emulation layer. Do we agree until here?
>>>
>>
>> No, we don't.  People are designing their data in ways that fit their
>> datastore. If all people did was implement a relational model in
>> userland code on top of non-relational databases then they'd really be
>> missing the point.
>
> Then you're calling everyone a fool. :) What do you call a CouchDB or
> Cassandra index mapping usernames to user pks? Its purpose is exactly
> to do something that relational DBs provide out-of-the-box. You can't
> deny that people do in fact manually maintain such indexes.
>
> So, your suggestion is to write code like this:
>
> # ----------
> class User(models.Model):
>    username = models.CharField(max_length=200)
>    email = models.CharField(max_length=200)
>    ...
>
> class UsernameUser(models.Model):
>    username = models.CharField(primary_key=True, max_length=200)
>    user_id = models.IntegerField()
>
> class EmailUser(models.Model):
>    email = models.CharField(primary_key=True, max_length=200)
>    user_id = models.IntegerField()
>
> def add_user(username, email):
>    user = User.objects.create(username=username, email=email)
>    UsernameUser.objects.create(username=username, user_id=user.id)
>    EmailUser.objects.create(email=email, user_id=user.id)
>    return user
>
> def get_user_by_username(username):
>    id = UsernameUser.objects.get(username=username).user_id
>    return User.objects.get(id=id)
>
> def get_user_by_email(email):
>    id = EmailUser.objects.get(email=email).user_id
>    return User.objects.get(id=id)
>
> get_user_by_username('marcus')
> get_user_by_email('mar...@marcus.com')
> # ----------
>
> What I'm proposing allows you to just write this:
>
> # ----------
> class User(models.Model):
>    username = models.CharField(max_length=200)
>    email = models.CharField(max_length=200)
>    ...
>
> User.objects.get(username='marcus')
> User.objects.get(email='mar...@marcus.com')
> # ----------
>
> Are you seriously saying that people should use the first version of
> the code when they work with a simplistic NoSQL DB (note, it's how
> they work today with those DBs)?
>
>>> * Django apps written for NoSQL will be portable across all NoSQL DBs
>>> without any code changes and in the worst case require only minor
>>> changes to switch to SQL
>>> * the resulting code is shorter and easier to understand than with a
>>> separate API which would only add another layer of indirection you'd
>>> have to think about *every* (!) single time you work with models (and
>>> if you have to think about this while writing model code you end up
>>> with potentially a lot more bugs, as is actually the case in practice)
>>> * developers won't have to use and learn a different models API (you'd
>>> only need to learn an API for specifying "optimization" rules, but the
>>> models would still be the same)
>>>
>>
>> Uhh, the whole point of this is that there is only a single API.
>
> And what you're suggesting is an API whose semantics are different on
> every single backend? How is that better? The indexing API would at
> least look and behave the same on all backends, so it's a "learn once
> and use anywhere" experience.
>
>>> What if you filter on one field defined in the parent class and
>>> another field defined on the child class? Emulating this query would
>>> be either very inefficient and (for large datasets) possibly return no
>>> results, at all, or require denormalization which I'd find funny in
>>> the case of MTI because it brings us back to single-table inheritance,
>>> but it might be the only solution that works efficiently on all NoSQL
>>> DBs.
>>>
>>
>> Filters on base fields can be implemented fairly easily on databases
>> with IN queries.  Otherwise I suppose it raises an exception.
>
> How would that be implemented with an IN filter (you have two
> different tables)? What would the (pseudo-)code look like?
>
> Bye,
> Waldemar Kornewald
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Django developers" group.
> To post to this group, send email to django-develop...@googlegroups.com.
> To unsubscribe from this group, send email to 
> django-developers+unsubscr...@googlegroups.com.
> For more options, visit this group at 
> http://groups.google.com/group/django-developers?hl=en.
>
>



-- 
Best regards, Yuri V. Baburov, ICQ# 99934676, Skype: yuri.baburov,
MSN: bu...@live.com
