Re: Django ManyToMany performance?

2013-10-29 Thread Tom Evans
On Mon, Oct 28, 2013 at 8:23 PM, Ruturaj Dhekane  wrote:
> Thanks Daniel and Robin.
>
> Have we seen the scales where Many-to-many DBs through Django work well?
> Like 50K articles in 5K publications.

Numbers on their own are meaningless. One of our legacy products has a
database with 300 million rows in it, and it works as well in Django as
it does in C++. What the application connecting to the database "is" is
largely irrelevant; what matters is what it does to the database.

> The aim of this question was to make a design choice too - whether I should
> use Django constructs/calls directly or should i write my own SQL to make
> queries - as my schemas are not just many-to-many but a lot more than that.
>
> Thanks for the suggestion on getting the DB schemas reviewed by a DB expert.
> I realized that is going to help me a lot for sure!

Almost anything you can express in an RDBMS DDL can be encapsulated as
a Django model - probably the only major outstanding issue is
multi-column primary keys. Using Django or using direct SQL on the same
schema will produce identical results.

Besides which, Django does not restrict you from using SQL, which is
sometimes necessary - whilst Django covers most of SQL as DDL, there
are many, many DML SQL queries that you cannot express in Django's ORM.

(DDL = Data Definition Language, DML = Data Manipulation Language, in
case any of those terms are unclear)

Cheers

Tom

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/CAFHbX1Lr9CMCgoLrfHS%3DX-WKrbEOMa0640nWwNi_yJ5E_iMkRw%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Django ManyToMany performance?

2013-10-28 Thread Ruturaj Dhekane
Thanks Daniel and Robin.

Have we seen the scales where Many-to-many DBs through Django work well?
Like 50K articles in 5K publications.
The aim of this question was to make a design choice too - whether I should
use Django constructs/calls directly or should I write my own SQL to make
queries - as my schemas are not just many-to-many but a lot more than that.

Thanks for the suggestion on getting the DB schemas reviewed by a DB
expert. I realized that is going to help me a lot for sure!


On Mon, Oct 28, 2013 at 1:40 AM, Robin St.Clair  wrote:

> First of all , any DBMS that is misused by design or in operation will
> give poor performance.. There is an excellent mature database available for
> Django developers, PostgreSQL. It works fine on the same machine as you are
> developing on.
>
> MongoDB is NOT a suitable DB for general purposes. It is fine for
> documents, less good for being updated, particularly as it locks tables
> when updating or inserting.
>
> Make sure your generated database is checked by somebody who knows about
> databases.
>
> R+C



Re: Django ManyToMany performance?

2013-10-27 Thread Robin St.Clair
First of all, any DBMS that is misused by design or in operation will give
poor performance. There is an excellent mature database available for
Django developers, PostgreSQL. It works fine on the same machine as you are
developing on.

MongoDB is NOT a suitable DB for general purposes. It is fine for 
documents, less good for being updated, particularly as it locks tables 
when updating or inserting.

Make sure your generated database is checked by somebody who knows about 
databases.

R+C



Re: Django ManyToMany performance?

2013-10-27 Thread Daniel Roseman
On Sunday, 27 October 2013 00:46:17 UTC+1, Ruturaj Dhekane wrote:

> Hi all,
>
> I have a particular datastructure where there are two objects
> 1. A document - and a lot of its properties - like content, version, 
> timestamp etc.
> 2. Contributors - basically people represented by unique IDs
>
> A document can have many contributors and a contributor can author many 
> documents. A typical manyTomany scenario
>
> Is there a design where I can find how ManyToMany is implemented (I know I 
> can go through the code - but existing design document/wiki/blog might 
> help).
>
> Also what are the performance implications? 
> How many SQL queries does it really make per request on django? 
> Can it scale to say 1million documents and a few thousand users?
> What is a good SQL backend I can use? MYSQL or SQLite?
> Has anyone tried with MongoDB etc for such relationships?
> Any examples where you have used ManyToMany and scaled it up for many 
> entries?
>
> Thank you for your answers. Would give me a lot of confidence in going 
> ahead with choosing M2M as a field. The whole project might hinge on it!
>
> Ruturaj
>
>
This question is a bit too vague to answer. A many-to-many relationship in 
a Django database is implemented just as it would be in any other SQL db - 
with an intermediate linking table containing foreign keys to both sides of 
the relationship - and queried via JOINs between the three tables. The only 
difference is that the linking table isn't exposed unless you specifically 
ask it to be.

There's nothing specifically efficient or inefficient about the m2m field: 
it just is what it is. If it fits your data model - which, given your 
description, I'd say it does - you should use it.
--
DR.
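
As a rough illustration of the layout described above, here is a sketch in plain SQL run through Python's sqlite3 module rather than through Django. The table and column names (document, contributor, document_contributors) and the helper contributors_of are made up for the example; Django generates its own names for the linking table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE contributor (id INTEGER PRIMARY KEY, name TEXT);
    -- the intermediate linking table: one row per (document, contributor) pair
    CREATE TABLE document_contributors (
        document_id INTEGER REFERENCES document(id),
        contributor_id INTEGER REFERENCES contributor(id),
        UNIQUE (document_id, contributor_id)
    );
    INSERT INTO document VALUES (1, 'Design notes'), (2, 'Release plan');
    INSERT INTO contributor VALUES (10, 'alice'), (11, 'bob');
    INSERT INTO document_contributors VALUES (1, 10), (1, 11), (2, 11);
""")


def contributors_of(title):
    """Return contributor names for a document via a JOIN across all three tables."""
    rows = conn.execute(
        """SELECT c.name
           FROM document AS d
           JOIN document_contributors AS dc ON dc.document_id = d.id
           JOIN contributor AS c ON c.id = dc.contributor_id
           WHERE d.title = ?
           ORDER BY c.name""",
        (title,),
    )
    return [name for (name,) in rows]
```

Calling contributors_of('Design notes') walks document -> linking table -> contributor in one query, which is the shape of query an m2m lookup boils down to.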



Django ManyToMany performance?

2013-10-26 Thread Ruturaj Dhekane
Hi all,

I have a particular data structure where there are two objects:
1. A document - and a lot of its properties - like content, version,
timestamp etc.
2. Contributors - basically people represented by unique IDs

A document can have many contributors and a contributor can author many
documents. A typical many-to-many scenario.

Is there a design document where I can find how ManyToMany is implemented
(I know I can go through the code - but an existing design
document/wiki/blog might help)?

Also, what are the performance implications?
How many SQL queries does it really make per request in Django?
Can it scale to, say, 1 million documents and a few thousand users?
What is a good SQL backend I can use? MySQL or SQLite?
Has anyone tried MongoDB etc. for such relationships?
Any examples where you have used ManyToMany and scaled it up for many
entries?

Thank you for your answers. Would give me a lot of confidence in going
ahead with choosing M2M as a field. The whole project might hinge on it!

Ruturaj



Re: ManyToMany Performance

2010-01-30 Thread Russell Keith-Magee
On Sun, Jan 31, 2010 at 1:38 AM, chefsmart  wrote:
> Let's say I have two models -  Article and Publication. Article has a
> field
>
> publications = models.ManyToManyField(Publication)
>
> Let's say I present the user with a series of checkboxes representing
> publications (much like the ModelMultipleChoiceField, but I am not
> using ModelForms here) and getting back selected publications with
>
> # getting the selected primary keys -- integers
> selectedPubs = map(int, request.POST.getlist('publications'))
>
> Then if I do
>
> for x in selectedPubs :
>    article1.publications.add(x)
>
> This seems not a good enough solution as far as performance and
> database interaction is concerned.
>
> What is a better way to handle a situation like this?

article1.publications.add(*selectedPubs)

will allow you to insert all the entries in selectedPubs into the m2m
relation in a single call. This doesn't quite reduce to 1 SQL call,
but it will be fewer SQL calls than multiple individual insertions.

Yours,
Russ Magee %-)
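
To make the difference concrete without a Django project at hand, the sketch below uses sqlite3 directly and counts the statements it issues. It is only an analogy for what add() does: the table name article_publications and both helper functions are invented here, and the SQL Django actually emits differs (as noted above, it is more than one statement).

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE article_publications "
        "(article_id INTEGER, publication_id INTEGER)"
    )
    return conn


def add_one_by_one(conn, article_id, pub_ids):
    """One INSERT per publication, like calling add(x) inside a loop."""
    statements = 0
    for pub_id in pub_ids:
        conn.execute(
            "INSERT INTO article_publications VALUES (?, ?)",
            (article_id, pub_id),
        )
        statements += 1
    return statements


def add_all_at_once(conn, article_id, pub_ids):
    """A single multi-row INSERT, closer in spirit to add(*pub_ids)."""
    pub_ids = list(pub_ids)
    if not pub_ids:
        return 0
    placeholders = ", ".join("(?, ?)" for _ in pub_ids)
    # Flatten to [article_id, pub_1, article_id, pub_2, ...]
    params = [value for pub_id in pub_ids for value in (article_id, pub_id)]
    conn.execute(
        f"INSERT INTO article_publications VALUES {placeholders}", params
    )
    return 1


selected_pubs = [3, 5, 8]
```

Both helpers leave the same rows in the table; the second simply makes one round trip to the database instead of one per selected publication.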




ManyToMany Performance

2010-01-30 Thread chefsmart
Let's say I have two models -  Article and Publication. Article has a
field

publications = models.ManyToManyField(Publication)

Let's say I present the user with a series of checkboxes representing
publications (much like the ModelMultipleChoiceField, but I am not
using ModelForms here) and getting back selected publications with

# getting the selected primary keys -- integers
selectedPubs = map(int, request.POST.getlist('publications'))

Then if I do

for x in selectedPubs:
    article1.publications.add(x)

This seems not a good enough solution as far as performance and
database interaction is concerned.

What is a better way to handle a situation like this?

Regards.




Re: ManyToMany performance

2008-10-01 Thread [EMAIL PROTECTED]

What you could do is create the intermediary table explicitly using
the through keyword:
http://docs.djangoproject.com/en/dev//topics/db/models/#extra-fields-on-many-to-many-relationships
Then you can run queries on that table.

Alex
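
Once the linking table can be queried, the connectivity information asked about in this thread is just one more result set to merge in Python. The sketch below shows that merge with sqlite3 rather than Django models; the table names (book, reader, reader_books_read), the sample rows, and the helper readers_with_books are assumptions made for the example, not the names Django would generate.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE reader (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE reader_books_read (reader_id INTEGER, book_id INTEGER);
    INSERT INTO book VALUES
        (1, 'Great Expectations'), (2, 'Jane Eyre'), (3, 'War and Peace');
    INSERT INTO reader VALUES
        (1, 'Claire', 'Smith'), (2, 'Dan', 'Smith'), (3, 'Ann', 'Jones');
    INSERT INTO reader_books_read VALUES (1, 1), (1, 2), (2, 1), (2, 3), (3, 2);
""")


def readers_with_books(last_name):
    # Query 1: the readers themselves.
    readers = conn.execute(
        "SELECT id, first_name, last_name FROM reader "
        "WHERE last_name = ? ORDER BY id",
        (last_name,),
    ).fetchall()
    reader_ids = [row[0] for row in readers]
    if not reader_ids:
        return {}
    marks = ", ".join("?" for _ in reader_ids)
    # Query 2: link rows joined to books, for all of those readers at once.
    links = conn.execute(
        f"""SELECT l.reader_id, b.name
            FROM reader_books_read AS l
            JOIN book AS b ON b.id = l.book_id
            WHERE l.reader_id IN ({marks})
            ORDER BY b.name""",
        reader_ids,
    ).fetchall()
    books_by_reader = defaultdict(list)
    for reader_id, book_name in links:
        books_by_reader[reader_id].append(book_name)
    # Stitch the two result sets together in Python.
    return {
        f"{first} {last}": books_by_reader[reader_id]
        for reader_id, first, last in readers
    }
```

The number of queries stays at two however many readers match. Later Django releases ship prefetch_related(), which performs a comparable merge for many-to-many fields without hand-written SQL.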

On Oct 1, 7:17 pm, "Dan W." <[EMAIL PROTECTED]> wrote:
> This is my first time building an app with django and so far I've been
> more than happy with it. However, I can't seem to figure out how to
> sufficiently optimize queries for ManyToMany relationships. Here is a
> model to illustrate my problem:
>
> class Book(models.Model):
>      name = models.CharField(max_length=255)
>
> class Reader(models.Model):
>      first_name = models.CharField(max_length=30)
>      last_name = models.CharField(max_length=30)
>      books_read = models.ManyToManyField(Book, related_name='read_by')
>
> Now I simply want to list every reader along with the books that
> they've read. For example:
>
> Reader          Books Read
> ----------------------------------------------
> Claire Smith    Great Expectations
>                 Jane Eyre
>                 Wuthering Heights
> Dan Smith       Great Expectations
>                 War and Peace
>                 Adventures of Sherlock Holmes
> Bob Smith       Adventures of Sherlock Holmes
>                 The Call of the Wild
>                 War and Peace
>
> Ideally select_related() would work with ManyToMany relationships as
> well as ForeignKey relationships but I can see that this would take
> the model query API to a whole other level of complexity so I can
> understand why it hasn't been done yet. I'm used to ORM libraries like
> Java's Hibernate that can fetch entire graphs of objects with a
> minimum number of queries but I'm more than happy to do a little extra
> work to avoid the complexity of something like Hibernate.
>
> However, I would hope that I could fetch the necessary data to
> instantiate the object graph for any number of readers with only 2
> queries. The queries would look like:
>     readers = Reader.objects.filter(last_name='Smith');
>     books = Book.objects.filter(read_by__last_name='Smith');
>
> These two queries get me all the data I need but I still don't have
> the connectivity information to know which readers have read which
> books. In order to get this piece of info, it looks like I'll have to
> drop into the native sql world and fetch the info directly from the
> many-to-many table created by django. Is this what is necessary to do the
> database optimization or am I missing something? It's not so bad in
> this simple case but as the graph of interesting objects gets bigger,
> I'm afraid I may have to write a lot of this type of code.
>
> Thanks,
>
> --Dan



ManyToMany performance

2008-10-01 Thread Dan W.

This is my first time building an app with django and so far I've been
more than happy with it. However, I can't seem to figure out how to
sufficiently optimize queries for ManyToMany relationships. Here is a
model to illustrate my problem:

class Book(models.Model):
    name = models.CharField(max_length=255)

class Reader(models.Model):
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)
    books_read = models.ManyToManyField(Book, related_name='read_by')


Now I simply want to list every reader along with the books that
they've read. For example:

Reader          Books Read
----------------------------------------------
Claire Smith    Great Expectations
                Jane Eyre
                Wuthering Heights
Dan Smith       Great Expectations
                War and Peace
                Adventures of Sherlock Holmes
Bob Smith       Adventures of Sherlock Holmes
                The Call of the Wild
                War and Peace


Ideally select_related() would work with ManyToMany relationships as
well as ForeignKey relationships but I can see that this would take
the model query API to a whole other level of complexity so I can
understand why it hasn't been done yet. I'm used to ORM libraries like
Java's Hibernate that can fetch entire graphs of objects with a
minimum number of queries but I'm more than happy to do a little extra
work to avoid the complexity of something like Hibernate.

However, I would hope that I could fetch the necessary data to
instantiate the object graph for any number of readers with only 2
queries. The queries would look like:
    readers = Reader.objects.filter(last_name='Smith')
    books = Book.objects.filter(read_by__last_name='Smith')

These two queries get me all the data I need but I still don't have
the connectivity information to know which readers have read which
books. In order to get this piece of info, it looks like I'll have to
drop into the native sql world and fetch the info directly from the
many-to-many table created by django. Is this what is necessary to do the
database optimization or am I missing something? It's not so bad in
this simple case but as the graph of interesting objects gets bigger,
I'm afraid I may have to write a lot of this type of code.

Thanks,

--Dan




Re: ManyToMany performance..

2008-09-04 Thread krylatij

Sorry, you need to take the first element:

story = Story.objects.filter(pk=1).extra(
    select={'main_section': sub_select}
).values('main_section', 'other_field1', 'other_field2')[0]

because here we get a list of dictionaries.
Good luck!





Re: ManyToMany performance..

2008-09-04 Thread krylatij

You can use an extra select to get the additional parameter 'main_section'.
This is for MySQL and it's only a sample;
you need to test it against your DB to make sure that it works.

sub_select = """SELECT section_name
                FROM %(table_section)s AS sc, %(table_section_story)s AS st
                WHERE st.story_id = %(table_story)s.id
                  AND sc.id = st.section_id
                ORDER BY sc.`order`
                LIMIT 1""" % {
    'table_section': Section._meta.db_table,
    # Not sure!!! Check the actual table name in your DB
    'table_section_story': 'myapp_section_story',
    'table_story': Story._meta.db_table,
}

We don't create a Story object here; we just get a dictionary with all
the values we need in the template

story = Story.objects.filter(pk=1).extra(
    select={'main_section': sub_select}
).values('main_section', 'other_field1', 'other_field2')

In template:
{{ story.main_section }}



Re: ManyToMany performance..

2008-09-04 Thread [EMAIL PROTECTED]

I modified the 'get_main_section' method in this way..

def get_main_section(self):
    try:
        sections = list(self.id_section.all().order_by('order'))
        if (len(sections) > 1):
            section_name = sections[1]
        else:
            section_name = sections[0]
        return section_name
    except IndexError:  # assumed; the except clause was cut from the post
        return None

But my question is:
is it possible to populate the id_section field in my Story object just
once?
If in the same template I write:

{{ story.get_main_section }}
{{ story.get_main_section }}
{{ story.get_main_section }}

Django makes 3 queries to MySQL...
I tried with select_related, but without success.

Thanks
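
One way to keep repeated template calls from hitting the database again is to store the result on the instance the first time it is computed. The sketch below shows the idea on a plain Python stand-in, not a real Django model: the Story class, its query_count counter, and the _fetch_sections helper are hypothetical and only imitate the queryset call from the post.

```python
class Story:
    """Stand-in for the model; query_count mimics database hits."""

    def __init__(self, sections):
        self._sections = sections
        self.query_count = 0

    def _fetch_sections(self):
        # Pretend this is list(self.id_section.all().order_by('order'))
        self.query_count += 1
        return sorted(self._sections, key=lambda s: s["order"])

    def get_main_section(self):
        # Compute once, then reuse the value stored on the instance.
        if not hasattr(self, "_main_section_cache"):
            sections = self._fetch_sections()
            if len(sections) > 1:
                self._main_section_cache = sections[1]
            else:
                self._main_section_cache = sections[0]
        return self._main_section_cache


story = Story([
    {"section_name": "home", "order": 0},
    {"section_name": "sport", "order": 1},
])
names = [story.get_main_section()["section_name"] for _ in range(3)]
```

Three calls result in a single fetch. The cache lives only as long as the instance, so a fresh object on the next request starts clean; like the original method, it raises IndexError for a story with no sections.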




Re: ManyToMany performance..

2008-09-03 Thread krylatij

I still think you need a second query.
As I understand your code, get_main_section will return a list of
Sections in the Story?
I mean this line:
(section_name = self.id_section.all().order_by('order') )

bad:
    if (len(self.id_section.all()) > 1):  # hits DB
        section_name = self.id_section.all().order_by('order')  # hits DB

better:
    sections = self.id_section.all()
    if (len(sections) > 1):
 






Re: ManyToMany performance..

2008-09-03 Thread [EMAIL PROTECTED]

I was not precise enough..
The 'Story' object has a method get_main_section...

def get_main_section(self):
    try:
        if (len(self.id_section.all()) > 1):
            section_name = self.id_section.all().order_by('order')[1]
        else:
            section_name = self.id_section.all()[0]
        return section_name
    except IndexError:  # assumed; the except clause was cut from the post
        return None

In my template I have: {{ story.get_main_section }}

..sorry..
Davide


On 3 Sep, 16:36, krylatij <[EMAIL PROTECTED]> wrote:
> it seems you need extra select anyway
> or use .extra(...) to add additional args to db query
> (http://www.djangoproject.com/documentation/db-api/#extra-select-none-
> where-none-params-none-tables-none-order-by-none-select-params-none)
>
> But i don't understand. Imagine we have Story with 3 Sections
> And in template we access this sections via
> {{ my_simple_story.id_section.section_name }}
> Which section_name (we have 3 Sections) you want to display in this
> template?
>
> It seem's you need another syntax like:
> {% for section in my_simple_story.id_section %}
>      {{ section.section_name}} 
> {% endfor %}



Re: ManyToMany performance..

2008-09-03 Thread krylatij

It seems you need an extra select anyway,
or use .extra(...) to add additional args to the db query
(http://www.djangoproject.com/documentation/db-api/#extra-select-none-where-none-params-none-tables-none-order-by-none-select-params-none)

But I don't understand. Imagine we have a Story with 3 Sections,
and in the template we access these sections via
{{ my_simple_story.id_section.section_name }}
Which section_name (we have 3 Sections) do you want to display in this
template?

It seems you need another syntax, like:
{% for section in my_simple_story.id_section.all %}
    {{ section.section_name }}
{% endfor %}




Re: ManyToMany performance..

2008-09-03 Thread krylatij

Solution #1:
   use the select_related() method when you get your Story from the db:
   Story.objects.select_related().get(pk=3)

Solution #2:
   add this method to your Section class definition:
   def __unicode__(self):
       return self.section_name

   and change your template to:
   {{ story.id_section }}






Re: ManyToMany performance..

2008-09-03 Thread [EMAIL PROTECTED]

I tried select_related, but it only follows ForeignKey relations, not
ManyToMany ones..
The Section object already has a __unicode__ method..

def __unicode__(self):
    return u'%s - %s' % (self.id_site.domain, self.section_name)

Thanks
Davide

On 3 Sep, 15:40, krylatij <[EMAIL PROTECTED]> wrote:
> Solution #1:
>    use select_related() method when you get your Story from db:
>    Story.objects.get(pk=3).select_related()
>
> Solution #2:
>    add to your Section class definition method:
>    def __unicode__(self):
>          return self.section_name
>
>    and change your template to:
>           {{ story.id_section }}



ManyToMany performance..

2008-09-03 Thread [EMAIL PROTECTED]

Hi All..

I've a performance problem with manyToMany relations..
I've a model like this:


class Story(models.Model):
    title = models.TextField(verbose_name="titolo", max_length=200, core=True)
    abstract = models.TextField('sommario', max_length=400, core=True)
    text = models.TextField(verbose_name="testo", core=True)
    id_section = models.ManyToManyField(Section, verbose_name="Sezione",
                                        null=False)
    ...

and the Section model is:

class Section(models.Model):
    id_site = models.ForeignKey(Site, verbose_name="Sito")
    section_name = models.CharField(max_length=100, core=True,
                                    verbose_name="Nome")
    order = models.IntegerField(default=0, verbose_name="Ordine")
    section_father = models.ForeignKey('self', blank=True, null=True,
                                       verbose_name="Sezione superiore")

In my template, every time I write:

{{ story.id_section.section_name }}

it seems that Django makes a new query on the 'section' table...
Is there a way to populate my Story object with the section values
just once?
In the MySQL log file I see a lot of identical queries...

