Hello everyone!
 
      I've been stuck for about a week trying to figure this out. Been on 
all sorts of blogs and asked questions on stackoverflow in hopes that it 
would lead in the right direction. I would like to know what you think I 
could do.
 
Let's say I have 4 models, a User model, Blog Model, Hashtag and a Comment 
model. Users can create blogs, users can comment on them and blogs can be 
tagged by the blog owner from the comments or the blog description itself. 
If I we're to best describe this, think of Instagram. You can hashtag your 
photo when your uploading it, and afterwards by commenting on it. 
 
Here are the models:
 
#models.py
 
from django.db import models
from django.contrib.auth.models import User
 
class Blog(models.Model):
               user = models.ForeignKey(User)
               description = models.CharField(max_length=140, blank=True)
               blog_text = models.CharField(max_length=5000,  blank=True)
               related_hashtags = models.ManyToManyField('Hashtag', 
related_name = 'tagged_post', null=True, blank=True)
 
 
class Comment(models.Model):
               user = models.ForeignKey(User)
               comment = models.CharField(max_length=140, blank=True)
 
class Hashtag(models.Model):
               user = models.ForeignKey(User)
               comment = models.CharField(max_length=140, blank=True)
               count = models.IntegerField(default=1)
 
Pretty simple. A blog can have multiple hashtags and a hashtag can belong 
to multiple blogs. When a user creates a blog, I check the description 
string for any hashtags using the following function below:
 
 hash_tags = re.findall(r'#(\w+)', blog.description, re.UNICODE)[:30] # 
each blog can have a maximum of 30 hashtags
 if hash_tags:
                for hash_tag in hash_tags:
                               try:
                                              oldHashtag = 
Hashtag.objects.get(hashtag = hash_tag)
                                              oldHashtag.count = 
oldHashtag.count + 1
                                              oldHashtag.save()
                                              
blog.related_hashtag.add(oldHashtag)

                               except Hashtag.DoesNotExist:
                                               newHashtag = 
Hashtag.objects.create(hashtag = lowercase_hash_tag)
                                              
 blog.related_hashtag.add(newHashtag)

I hope so far everything makes sense. I do the same as above whenever a 
Comment object is created.

Now let's imagine the hashtag #acdc is really popular, about 20-40 blogs 
are added to that hashtag about every millisecond.

If the user searches for #acdc, I would like to show the latest 20 blogs 
that have just been added with #acdc, and as the user loads more, I'll show 
the next 20. They are ordered in DESC order of insertion.

I had to resort to RAW SQL since Django couldn't order the queryset by the 
pk of the ManyToManyField, many others suggested using a through table, but 
for legacy reasons, I did it this way:

SELECT * FROM django_blog INNER JOIN django_blog_related_hashtag ON ( 
django_blog.id = django_blog_related_hashtag.blog_id ) INNER JOIN 
django_hashtag ON ( django_blog_related_hashtag.hashtag_id = 
django_hashtag.id ) WHERE ( django_hashtag.hashtag = 'acdc' ) ORDER BY 
django_blog_related_hashtag.id DESC LIMIT 20

If the user wanted to view page two, I'd simply add an OFFSET at the end of 
the query. A simple function like so:

OFFSET = (page_number -1) * 20

This all works great, but for web apps like Instagram or Twitter or this 
Blog app I just made, if the queryset is constantly growing, the user is 
not always in sync with the most recent data. Which is perfectly fine, I 
don't care about that. But what I do care about is that the next page that 
they request must contain the results that are suppose to come after the 
ones they have just received. If I don't account for this, users might see 
blogs that they have already seen from previous pages because of the 
growing queryset. Someone mentioned caching the entire queryset, but 
wouldn't that have a large memory over head?

I'm looking forward to hearing your thoughts and suggestions.

Kind Regards,

Riegie Godwin

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/c9dace38-4cde-4462-98bb-3ff689b14c7c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to