Re: Equivalent of multi-table JOIN (another post on reverse select_related)
On Monday, November 6, 2017 at 9:36:11 PM UTC+1, Matthew Pava wrote: > > Is it really that bad? Maybe I’m missing something in your situation. > Ooooh, it isn't really. I incorrectly assumed that the query would perform like having an implicit DISTINCT(device.id). But it does in fact return absolutely the right thing: >>> q = Device.objects.filter(pk='localhost', package__name__contains='i', unit__ip__address__contains='1') >>> q.count() 1257 Yeah, that solves the issue. Thanks a lot! -Samuel -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com. To post to this group, send email to django-users@googlegroups.com. Visit this group at https://groups.google.com/group/django-users. To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/7b0fbf85-5057-4e06-ab8c-6f16092b7fa3%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
RE: Equivalent of multi-table JOIN (another post on reverse select_related)
Is it really that bad? Maybe I’m missing something in your situation. I use my own custom page_queryset function. I never got around to looking at the built-in Django way of doing it. I think there is a generic view that can do paging. q = Device.objects.filter(hostname__contains= 'localhost ', package__name__contains= 'unix ', interface__IP__address__contains= '192') total_count = q.count() def page_queryset(qs, page, count_per_page): """ :param qs: the queryset or list to slice :param page: the current page to get records from (1-based) :param count_per_page: how many items are part of each page :return: a tuple: the index of the last page, the sliced queryset """ slice_begin = (page - 1) * count_per_page slice_end = slice_begin + count_per_page if type(qs) == QuerySet: max_count = qs.count() else: max_count = len(qs) slice_end = slice_end if slice_end < max_count else max_count last_page = max_count // count_per_page + 1 return last_page, qs[slice_begin:slice_end] From: django-users@googlegroups.com [mailto:django-users@googlegroups.com] On Behalf Of Samuel Abels Sent: Monday, November 6, 2017 2:22 PM To: Django users Subject: Re: Equivalent of multi-table JOIN (another post on reverse select_related) On Monday, November 6, 2017 at 9:15:09 PM UTC+1, Matthew Pava wrote: Maybe you are expecting too much from the user interface. Shouldn’t you at least request from the user what primary object you are looking for? The primary model is always that one that is closest to "device"; step 2 of the process already takes this into account and returns the path in the best order, with the primary model being returned first. Ultimately, your Django ORM code would look like this for your example, if I follow your arrows correctly: Device.objects.filter(hostname__contains= 'localhost ', package__name__contains= 'unix ', interface__IP__address__contains= '192') This would perform the right query, but would not provide me with a total. It would also be impossible to do paging, because slicing the result does not take into account that the LEFT JOIN multiplies the number of total rows. -Samuel -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com<mailto:django-users+unsubscr...@googlegroups.com>. To post to this group, send email to django-users@googlegroups.com<mailto:django-users@googlegroups.com>. Visit this group at https://groups.google.com/group/django-users. To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/f1c92f41-ce59-47c5-9a82-6af38467a536%40googlegroups.com<https://groups.google.com/d/msgid/django-users/f1c92f41-ce59-47c5-9a82-6af38467a536%40googlegroups.com?utm_medium=email&utm_source=footer>. For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com. To post to this group, send email to django-users@googlegroups.com. Visit this group at https://groups.google.com/group/django-users. To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/0220e4a2f3c84809a51c952f60b8b7f9%40ISS1.ISS.LOCAL. For more options, visit https://groups.google.com/d/optout.
Re: Equivalent of multi-table JOIN (another post on reverse select_related)
On Monday, November 6, 2017 at 9:15:09 PM UTC+1, Matthew Pava wrote: > > Maybe you are expecting too much from the user interface. Shouldn’t you > at least request from the user what primary object you are looking for? > The primary model is always that one that is closest to "device"; step 2 of the process already takes this into account and returns the path in the best order, with the primary model being returned first. > Ultimately, your Django ORM code would look like this for your example, if > I follow your arrows correctly: > > Device.objects.filter(hostname__contains= 'localhost ', > package__name__contains= 'unix ', interface__IP__address__contains= '192') > This would perform the right query, but would not provide me with a total. It would also be impossible to do paging, because slicing the result does not take into account that the LEFT JOIN multiplies the number of total rows. -Samuel > -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com. To post to this group, send email to django-users@googlegroups.com. Visit this group at https://groups.google.com/group/django-users. To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/f1c92f41-ce59-47c5-9a82-6af38467a536%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
RE: Equivalent of multi-table JOIN (another post on reverse select_related)
Maybe you are expecting too much from the user interface. Shouldn’t you at least request from the user what primary object you are looking for? Explicit is better than implicit. Your example indicates that your primary object is a Device, but your UI dict gives no indication whatsoever that that’s what you want. If you want to look for related fields in a model, check out the Model _meta API: https://docs.djangoproject.com/en/1.11/ref/models/meta/ Ultimately, your Django ORM code would look like this for your example, if I follow your arrows correctly: Device.objects.filter(hostname__contains= 'localhost ', package__name__contains= 'unix ', interface__IP__address__contains= '192') It would seem that you want your UI to pass in the primary model and with all other models how they are related to the primary model. From: django-users@googlegroups.com [mailto:django-users@googlegroups.com] On Behalf Of Samuel Abels Sent: Monday, November 6, 2017 1:59 PM To: Django users Subject: Re: Equivalent of multi-table JOIN (another post on reverse select_related) Thanks, I have seen that and plan to use it, but for this particular feature I need something more tailored. -Samuel On Monday, November 6, 2017 at 8:56:29 PM UTC+1, Matthew Pava wrote: Though it doesn’t directly answer your query, you might be interested in this package: https://github.com/burke-software/django-report-builder From: django...@googlegroups.com [mailto:django...@googlegroups.com] On Behalf Of Samuel Abels Sent: Monday, November 6, 2017 1:33 PM To: Django users Subject: Equivalent of multi-table JOIN (another post on reverse select_related) I am working on a reporting feature that allows users for querying arbitrary models and fields, and present the result as a table. For example, consider the following object model: Package | v Device <- Component ^ ^ | | | Interface <---2--- Connection ^^^ / | \ / | \ / | \ Sampling IP Policy (The dash is the direction of a ForeignKey.) To produce a report, I chose a three-step process: 1. The user interfaces returns a list of fields to be included in the report, such as args = dict('Device.hostname__contains': 'localhost', 'Package.name__icontains': 'unix', 'IP.address__contains': '192') 2. Given the list of args, find the shortest path that connects all required models. For the example above, the result is a tuple: path = Device, Package, Interface, IP 3. In theory, I could now perform the following SQL request: SELECT * FROM myapp_device d LEFT JOIN myapp_package pa ON pa.device_id=d.id<http://d.id> LEFT JOIN myapp_interface ifc ON ifc.device_id=d.id<http://d.id> LEFT JOIN myapp_ip ip ON ip.interface_id=ifc.id<http://ifc.id>; But of course, I want to avoid the raw SQL. I considered the following options: - Using Device.objects.select_related() does not work, because Device has a 1:n relation to Package (and also to Unit), which Django's select_related() does not support. - Using prefetch_related() does not work, because it prefetches everything, which is too much in our case (>100 million rows if a user queries on all tables), and it does not provide us with a total of the number of rows selected. In practice, I want to count(*) everything for displaying a total, and fetch only a subset, using LIMIT. Our tests showed that the raw SQL query with LEFT JOIN is fast enough for production, regardless of what fields and objects are being queried. The craziest query I built took about 20 seconds, which is ok for what we are trying to do. Any other options? -Samuel -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com. To post to this group, send email to djang...@googlegroups.com. Visit this group at https://groups.google.com/group/django-users. To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/ec0e2d37-9b0f-4623-8f60-c6feeef0eb9e%40googlegroups.com<https://groups.google.com/d/msgid/django-users/ec0e2d37-9b0f-4623-8f60-c6feeef0eb9e%40googlegroups.com?utm_medium=email&utm_source=footer>. For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com<mailto:django-users+unsubscr...@googlegroups.com>. To post to this group, send email to django-users@googlegroups.com<mailto:django-users@googlegroups.com>. Visit this group at https://groups.google.com/group/django-users. To view this discussion on
Re: Equivalent of multi-table JOIN (another post on reverse select_related)
Thanks, I have seen that and plan to use it, but for this particular feature I need something more tailored. -Samuel On Monday, November 6, 2017 at 8:56:29 PM UTC+1, Matthew Pava wrote: > > Though it doesn’t directly answer your query, you might be interested in > this package: > > https://github.com/burke-software/django-report-builder > > > > > > *From:* django...@googlegroups.com [mailto: > django...@googlegroups.com ] *On Behalf Of *Samuel Abels > *Sent:* Monday, November 6, 2017 1:33 PM > *To:* Django users > *Subject:* Equivalent of multi-table JOIN (another post on reverse > select_related) > > > > I am working on a reporting feature that allows users for querying > arbitrary models and fields, and present the result as a table. For > example, consider the following object model: > > > > Package > >| > >v > > Device <- Component > >^ ^ > >| | > >| Interface <---2--- Connection > > ^^^ > >/ | \ > > / | \ > > / | \ > > Sampling IP Policy > > > > (The dash is the direction of a ForeignKey.) > > To produce a report, I chose a three-step process: > > > > 1. The user interfaces returns a list of fields to be included in the > report, such as > > > > args = dict('Device.hostname__contains': 'localhost', > 'Package.name__icontains': 'unix', 'IP.address__contains': '192') > > > > 2. Given the list of args, find the shortest path that connects all > required models. For the example above, the result is a tuple: > > > > path = Device, Package, Interface, IP > > > > 3. In theory, I could now perform the following SQL request: > > > > SELECT * FROM myapp_device d > > LEFT JOIN myapp_package pa ON pa.device_id=d.id > > LEFT JOIN myapp_interface ifc ON ifc.device_id=d.id > > LEFT JOIN myapp_ip ip ON ip.interface_id=ifc.id; > > > > But of course, I want to avoid the raw SQL. I considered the following > options: > > > > - Using Device.objects.select_related() does not work, because Device has > a 1:n relation to Package (and also to Unit), which Django's > select_related() does not support. > > > > - Using prefetch_related() does not work, because it prefetches > everything, which is too much in our case (>100 million rows if a user > queries on all tables), and it does not provide us with a total of the > number of rows selected. In practice, I want to count(*) everything for > displaying a total, and fetch only a subset, using LIMIT. > > > > Our tests showed that the raw SQL query with LEFT JOIN is fast enough for > production, regardless of what fields and objects are being queried. The > craziest query I built took about 20 seconds, which is ok for what we are > trying to do. > > > > Any other options? > > > > -Samuel > > -- > You received this message because you are subscribed to the Google Groups > "Django users" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to django-users...@googlegroups.com . > To post to this group, send email to djang...@googlegroups.com > . > Visit this group at https://groups.google.com/group/django-users. > To view this discussion on the web visit > https://groups.google.com/d/msgid/django-users/ec0e2d37-9b0f-4623-8f60-c6feeef0eb9e%40googlegroups.com > > <https://groups.google.com/d/msgid/django-users/ec0e2d37-9b0f-4623-8f60-c6feeef0eb9e%40googlegroups.com?utm_medium=email&utm_source=footer> > . > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com. To post to this group, send email to django-users@googlegroups.com. Visit this group at https://groups.google.com/group/django-users. To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/f692aa3a-cc23-4d78-8e3f-16ed30097038%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
RE: Equivalent of multi-table JOIN (another post on reverse select_related)
Though it doesn’t directly answer your query, you might be interested in this package: https://github.com/burke-software/django-report-builder From: django-users@googlegroups.com [mailto:django-users@googlegroups.com] On Behalf Of Samuel Abels Sent: Monday, November 6, 2017 1:33 PM To: Django users Subject: Equivalent of multi-table JOIN (another post on reverse select_related) I am working on a reporting feature that allows users for querying arbitrary models and fields, and present the result as a table. For example, consider the following object model: Package | v Device <- Component ^ ^ | | | Interface <---2--- Connection ^^^ / | \ / | \ / | \ Sampling IP Policy (The dash is the direction of a ForeignKey.) To produce a report, I chose a three-step process: 1. The user interfaces returns a list of fields to be included in the report, such as args = dict('Device.hostname__contains': 'localhost', 'Package.name__icontains': 'unix', 'IP.address__contains': '192') 2. Given the list of args, find the shortest path that connects all required models. For the example above, the result is a tuple: path = Device, Package, Interface, IP 3. In theory, I could now perform the following SQL request: SELECT * FROM myapp_device d LEFT JOIN myapp_package pa ON pa.device_id=d.id LEFT JOIN myapp_interface ifc ON ifc.device_id=d.id LEFT JOIN myapp_ip ip ON ip.interface_id=ifc.id; But of course, I want to avoid the raw SQL. I considered the following options: - Using Device.objects.select_related() does not work, because Device has a 1:n relation to Package (and also to Unit), which Django's select_related() does not support. - Using prefetch_related() does not work, because it prefetches everything, which is too much in our case (>100 million rows if a user queries on all tables), and it does not provide us with a total of the number of rows selected. In practice, I want to count(*) everything for displaying a total, and fetch only a subset, using LIMIT. Our tests showed that the raw SQL query with LEFT JOIN is fast enough for production, regardless of what fields and objects are being queried. The craziest query I built took about 20 seconds, which is ok for what we are trying to do. Any other options? -Samuel -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com<mailto:django-users+unsubscr...@googlegroups.com>. To post to this group, send email to django-users@googlegroups.com<mailto:django-users@googlegroups.com>. Visit this group at https://groups.google.com/group/django-users. To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/ec0e2d37-9b0f-4623-8f60-c6feeef0eb9e%40googlegroups.com<https://groups.google.com/d/msgid/django-users/ec0e2d37-9b0f-4623-8f60-c6feeef0eb9e%40googlegroups.com?utm_medium=email&utm_source=footer>. For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com. To post to this group, send email to django-users@googlegroups.com. Visit this group at https://groups.google.com/group/django-users. To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/296e72112fa946b6b3f13cf52b96303d%40ISS1.ISS.LOCAL. For more options, visit https://groups.google.com/d/optout.
Equivalent of multi-table JOIN (another post on reverse select_related)
I am working on a reporting feature that allows users for querying arbitrary models and fields, and present the result as a table. For example, consider the following object model: Package | v Device <- Component ^ ^ | | | Interface <---2--- Connection ^^^ / | \ / | \ / | \ Sampling IP Policy (The dash is the direction of a ForeignKey.) To produce a report, I chose a three-step process: 1. The user interfaces returns a list of fields to be included in the report, such as args = dict('Device.hostname__contains': 'localhost', 'Package.name__icontains': 'unix', 'IP.address__contains': '192') 2. Given the list of args, find the shortest path that connects all required models. For the example above, the result is a tuple: path = Device, Package, Interface, IP 3. In theory, I could now perform the following SQL request: SELECT * FROM myapp_device d LEFT JOIN myapp_package pa ON pa.device_id=d.id LEFT JOIN myapp_interface ifc ON ifc.device_id=d.id LEFT JOIN myapp_ip ip ON ip.interface_id=ifc.id; But of course, I want to avoid the raw SQL. I considered the following options: - Using Device.objects.select_related() does not work, because Device has a 1:n relation to Package (and also to Unit), which Django's select_related() does not support. - Using prefetch_related() does not work, because it prefetches everything, which is too much in our case (>100 million rows if a user queries on all tables), and it does not provide us with a total of the number of rows selected. In practice, I want to count(*) everything for displaying a total, and fetch only a subset, using LIMIT. Our tests showed that the raw SQL query with LEFT JOIN is fast enough for production, regardless of what fields and objects are being queried. The craziest query I built took about 20 seconds, which is ok for what we are trying to do. Any other options? -Samuel -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com. To post to this group, send email to django-users@googlegroups.com. Visit this group at https://groups.google.com/group/django-users. To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/ec0e2d37-9b0f-4623-8f60-c6feeef0eb9e%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
reverse select_related()
I am using a unique=True foreign key relationship to model inheritance, as suggested by the documentation. Say a TypedItem model that has a FK to BaseItem. Now I am in a situation where it would be really nice to be able to access the typed item from within the base item, e.g. the one row in BaseItem.typeditem_set. Unfortunately, for a large number of BaseItems, that quickly starts to be come suboptimal, as the db is queried at least once for each, and select_related() only works the other way round, e.g. will not cache the reverse set. Any ideas on how to approach this? Thanks, Michael --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~--~~~~--~~--~--~---