Re: Instance Based Management?

2020-10-27 Thread Matthew Amstutz
Thanks Adam! I'll do some experimenting and see what I can come up with!
Also, I'll stop insulting myself lol. I wasn't expecting such polite
replies; everyone has been telling me that a lot of developers are mean to
newbies.

I've been coding for 2 years and using Django for a year and a half. I've
got a lot of ideas to try and make things easier for developers, but
experimentation is going to be the most important thing for it. I'll try
and get some data before posting next time.

Thanks again!


Re: Instance Based Management?

2020-10-27 Thread Adam Johnson
>
> Thanks again everyone, sorry for the ridiculously stupid post lol (I swear
> I'm not 100% new)


Self-insulting is not tolerated on this list! You're not stupid, nor was
your post. It's an interesting problem to have, thank you for posting.

If indexes aren't working, partitioning does seem to be the way to reach
high scale, and perhaps Django could do more to support it. If you try it
out, do let us know if you spot any improvements that could be made to
Django.




Re: Instance Based Management?

2020-10-27 Thread Matthew Amstutz
The idea was to automatically generate a table with auto-generated fields
that gets arbitrarily connected to any content that can be spawned by
users. Then I realized this is pretty much an M2M field with extra steps,
but it wouldn't let me delete the post lol. I didn't know about table
partitioning, so thank you for the information. I know about ContentTypes
and such; I was just thinking it may be possible to cut some of the work
out for the developer so that instead of explicitly declaring fields, they
can just access the existing one. Like I said, I realized this was stupid but
the group wouldn't let me delete it lol. Thanks again for the information.
I only posted on this group because I thought the idea could be put in
the core just to cut some of the work out for developers.

Thanks again everyone, sorry for the ridiculously stupid post lol (I swear 
I'm not 100% new)


Re: Instance Based Management?

2020-10-27 Thread Tom Forbes
I think what Matthew really wants is support for table partitioning. You can get
this right now with this library[1] for Postgres. I'm not sure if this makes
sense to add to core; however, database support for partitioning is quite broad
(MySQL, MariaDB, Postgres and Oracle).

1. 
https://django-postgres-extra.readthedocs.io/en/master/table_partitioning.html
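
To make the idea concrete, here is a minimal sketch of why partitioning by user helps: a lookup for one user only has to touch the partition that user's rows live in. This is a toy in plain Python, not the API of the library linked above. The names `PartitionedPosts` and `partition_for` are hypothetical, and the modulo routing is an assumption standing in for whatever hash function a real database uses.

```python
from collections import defaultdict

NUM_PARTITIONS = 8  # assumed partition count, chosen only for illustration


def partition_for(user_id: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Return the index of the partition that holds all rows for this user."""
    # Assumption: plain modulo stands in for a real hash function.
    return user_id % num_partitions


class PartitionedPosts:
    """Toy stand-in for a posts table partitioned by user (hypothetical)."""

    def __init__(self, num_partitions: int = NUM_PARTITIONS) -> None:
        self.num_partitions = num_partitions
        self.partitions = defaultdict(list)  # partition index -> list of rows

    def insert(self, user_id: int, body: str) -> None:
        index = partition_for(user_id, self.num_partitions)
        self.partitions[index].append((user_id, body))

    def posts_for(self, user_id: int) -> list:
        # Only one partition is scanned; the others are never touched.
        rows = self.partitions[partition_for(user_id, self.num_partitions)]
        return [body for uid, body in rows if uid == user_id]


table = PartitionedPosts()
for uid in range(100):
    table.insert(uid, f"post by user {uid}")
table.insert(42, "second post by user 42")

print(table.posts_for(42))
```

Unlike one table per user, the number of partitions stays fixed as users are added, and the application still sees a single logical table.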

Tom


-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/BA1C3DF2-3BD1-473F-BA01-016ACA251D5D%40tomforb.es.


Re: Instance Based Management?

2020-10-26 Thread Jure Erznožnik

Hi Matthew,

I think you found the wrong mailing list for this question. Might I 
suggest you try django-us...@googlegroups.com? The question seems better 
suited there.


That said, I don't know why you wouldn't want to use foreign keys in 
this scenario, but Django does support a thing called content types 
<https://docs.djangoproject.com/en/3.1/ref/contrib/contenttypes/> for 
what you seem to be suggesting. There's a section on that page called 
"Generic relations 
<https://docs.djangoproject.com/en/3.1/ref/contrib/contenttypes/#generic-relations>". 



Have a look.
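
As a rough illustration of the foreign key approach, the sketch below uses Python's built-in `sqlite3` module rather than Django itself, so it runs without a project. The table and column names (`author`, `post`, `author_id`) are made up for the example. In Django, a `ForeignKey` field gets a database index by default, which the explicit `CREATE INDEX` here imitates, and the query corresponds roughly to `Post.objects.filter(author_id=1)` on a hypothetical `Post` model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES author (id),
        body TEXT NOT NULL
    );
    -- Imitates the index Django creates for a ForeignKey by default.
    CREATE INDEX post_author_idx ON post (author_id);
    """
)
conn.executemany(
    "INSERT INTO author (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)
conn.executemany(
    "INSERT INTO post (author_id, body) VALUES (?, ?)",
    [(1, "first"), (2, "hello"), (1, "second")],
)


def posts_for(author_id):
    """Return the bodies of one author's posts, oldest first."""
    rows = conn.execute(
        "SELECT body FROM post WHERE author_id = ? ORDER BY id",
        (author_id,),
    )
    return [body for (body,) in rows]


print(posts_for(1))
```

All posts live in one table, and the indexed `author_id` column is what lets the database find a single user's rows without reading everyone else's.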

LP,
Jure



Re: Instance Based Management?

2020-10-26 Thread Daryl
What you are describing sounds like denormalisation, but the first step
would be to ensure that the indexes on the child tables are being created.

The query time to find rows if the lookup field is indexed is close to
logarithmic in the size of the table; without an index it will be close to
linear, because every row has to be scanned.
You should only denormalise after you have investigated the root cause of
your performance issues. Most times, denormalisation is a form of premature
optimization [ https://stackify.com/premature-optimization-evil/ ]
If you are talking about virtual denormalisation (i.e. a view or table
generated by the manager you mention) then this relies again on indexing of
the underlying data to work, so you get no performance gain.
Correct me if I'm wrong, but AFAIK this is true of most managers - they
help you with your logical view of the data, not the performance of
retrieving it.
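
One way to check whether a lookup is actually using an index is to ask the database for its query plan. The sketch below does this with Python's built-in `sqlite3` module and `EXPLAIN QUERY PLAN`. The `post` table, its columns and the index name are assumptions made up for the example, and other databases have their own equivalent (for instance `EXPLAIN` in PostgreSQL and MySQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO post (author_id, body) VALUES (?, ?)",
    [(i % 1000, f"post {i}") for i in range(10_000)],
)

QUERY = "SELECT body FROM post WHERE author_id = ?"


def plan(sql, params):
    """Return SQLite's query plan for the statement as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


before = plan(QUERY, (7,))
conn.execute("CREATE INDEX post_author_idx ON post (author_id)")
after = plan(QUERY, (7,))

print("without index:", before)
print("with index:   ", after)
```

The exact wording depends on the SQLite version, but the first plan reports a scan of the whole table and the second reports a search using `post_author_idx`.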

There are many Django deployments where the performance is fine with orders
of magnitude more data than you are referring to; it just takes careful
planning.

D


>


-- 
-- 
==
Daryl Egarr,  Director
Kawhai Consultants Ltd
Cell   021 521 353
da...@kawhai.net
==



Instance Based Management?

2020-10-26 Thread Matthew Amstutz

Hello, I was wondering about instance-based management. If I'm wrong,
please tell me.

When we have users and user-generated content in a large database, query
times increase significantly. Why is there no instance-based manager
(like models.Manager()) that basically generates a table for each user
and queries ONLY that table? Would that not just flatten the database
instead of increasing its size? For example, if we have 1,000,000 users,
nearly all of whom generate at least 10 posts per day, and one of the users
only generates 5 in the span of 10 days, then unless we have a many-to-many
field or something to hold those five posts, the query time to find their
posts would be ridiculous.

So if we have a table generated for each user that holds arbitrary
connections to anything they generate, it would in theory cut query times
significantly. Why is there no feature like this? Again, if I'm wrong
please tell me, but as I understand it the number of tables doesn't matter,
only the data they hold does: 1,000,000,000 posts will always be the size
of 1,000,000,000 posts no matter how they are organized.

I've got ideas on implementation and even asynchronous support as well as
customization, but I have no idea how to bring this up to the Django
developers and I'm not even sure it would work (though, no matter how hard
I try, I can't see anything wrong with it).

Let me know your input and if there's a way I can ask the Django devs about
this and possibly even suggest a few things pertaining to it. I'd like to
help make Django the best it can be, and if this works and we can implement
it, Django will be very fast with user-generated content.
