Re: [openstack-dev] [glance] Why no DB index on sort parameters
Now that raises a question: do we really need sorting based on arbitrary keys in our API (e.g. listing images, volumes, instances)? If we have this feature in our API, we're bound to run into problems by creating or not creating indexes, at large volumes -- hurts our motive to be easily-implementable for clouds of all sizes. -Rushi On 23 April 2015 at 20:40, Nikhil Komawar nikhil.koma...@rackspace.com wrote: Messing with indices is not a good idea to do iteratively. Indexing large data sets is a really expensive operation and should be done carefully and consistently. Changing around indices is only going to make things unstable. Thanks, -Nikhil From: Flavio Percoco fla...@redhat.com Sent: Thursday, April 23, 2015 7:52 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 21/04/15 14:55 +, Nikhil Komawar wrote: Rally is great overall however, we need good EXPLAIN examples on real world data. Smaller deployments might benefit from a simple sample performance analysis however, larger data sets can have impacts on areas that you never expect. A spec means that we document the indices proposed in the code base, based on all of the use cases. The way I look at it, a patch is needed anyways and it (rally gate job) would get attention from reviewers when the patch is proposed. Yes, I believe we need both. However, I'd probably just start with something smaller and see how it behaves before going with big data sets. I'm not saying we don't need tests with proper data sets, I'm saying that I'd probably start with smaller ones. As Mike already mentioned in his email, there's an impact in writes and we can see that from Rally tests, AFAICT. The spec can come later, IMHO. Cheers, Flavio From: Flavio Percoco fla...@redhat.com Sent: Tuesday, April 21, 2015 10:48 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 21/04/15 14:39 +, Nikhil Komawar wrote: This is a good idea. We recently removed a unique constraint that may result into some queries being very slow especially those that involve name property. I would recommend sketching out a spec that identifies potential full table scans especially for queries that join over image_properties table. We should discuss there what other use cases look like rather than smaller feedback on the ML. More thatn a spec, I'd be interested in seeing the patch with the change up and the results reported in Rally. I guess we'll need a spec anyway, although I'd probably be ok with a good bug report here. /me *shrugs* Flavio Thanks, -Nikhil ━━━ From: Mike Bayer mba...@redhat.com Sent: Tuesday, April 21, 2015 9:45 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 4/21/15 2:47 AM, Ajaya Agrawal wrote: Hi All, I see that glance supports arbitrary sort parameters and the default is created_at while listing images. Is there any reason why we don't have index over these fields? If we have an index over these fields then we would avoid a full table scan to do sorting. IMO at least the created_at field should have an index on it. just keep in mind that more indexes will place a performance penalty on INSERT statements, particularly at larger volumes. I have no idea if that is important here but something to keep in mind. Cheers, Ajaya __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- @flaper87 Flavio Percoco __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- @flaper87 Flavio Percoco __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo
Re: [openstack-dev] [glance] Why no DB index on sort parameters
At the very least, an index on the default sort column (created_at) would be appropriate, IMO. Best, -jay On 04/27/2015 01:42 PM, Rushi Agrawal wrote: Now that raises a question: do we really need sorting based on arbitrary keys in our API (e.g. listing images, volumes, instances)? If we have this feature in our API, we're bound to run into problems by creating or not creating indexes, at large volumes -- hurts our motive to be easily-implementable for clouds of all sizes. -Rushi On 23 April 2015 at 20:40, Nikhil Komawar nikhil.koma...@rackspace.com mailto:nikhil.koma...@rackspace.com wrote: Messing with indices is not a good idea to do iteratively. Indexing large data sets is a really expensive operation and should be done carefully and consistently. Changing around indices is only going to make things unstable. Thanks, -Nikhil From: Flavio Percoco fla...@redhat.com mailto:fla...@redhat.com Sent: Thursday, April 23, 2015 7:52 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 21/04/15 14:55 +, Nikhil Komawar wrote: Rally is great overall however, we need good EXPLAIN examples on real world data. Smaller deployments might benefit from a simple sample performance analysis however, larger data sets can have impacts on areas that you never expect. A spec means that we document the indices proposed in the code base, based on all of the use cases. The way I look at it, a patch is needed anyways and it (rally gate job) would get attention from reviewers when the patch is proposed. Yes, I believe we need both. However, I'd probably just start with something smaller and see how it behaves before going with big data sets. I'm not saying we don't need tests with proper data sets, I'm saying that I'd probably start with smaller ones. As Mike already mentioned in his email, there's an impact in writes and we can see that from Rally tests, AFAICT. The spec can come later, IMHO. Cheers, Flavio From: Flavio Percoco fla...@redhat.com mailto:fla...@redhat.com Sent: Tuesday, April 21, 2015 10:48 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 21/04/15 14:39 +, Nikhil Komawar wrote: This is a good idea. We recently removed a unique constraint that may result into some queries being very slow especially those that involve name property. I would recommend sketching out a spec that identifies potential full table scans especially for queries that join over image_properties table. We should discuss there what other use cases look like rather than smaller feedback on the ML. More thatn a spec, I'd be interested in seeing the patch with the change up and the results reported in Rally. I guess we'll need a spec anyway, although I'd probably be ok with a good bug report here. /me *shrugs* Flavio Thanks, -Nikhil ━━━ From: Mike Bayer mba...@redhat.com mailto:mba...@redhat.com Sent: Tuesday, April 21, 2015 9:45 AM To: openstack-dev@lists.openstack.org mailto:openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 4/21/15 2:47 AM, Ajaya Agrawal wrote: Hi All, I see that glance supports arbitrary sort parameters and the default is created_at while listing images. Is there any reason why we don't have index over these fields? If we have an index over these fields then we would avoid a full table scan to do sorting. IMO at least the created_at field should have an index on it. just keep in mind that more indexes will place a performance penalty on INSERT statements, particularly at larger volumes. I have no idea if that is important here but something to keep in mind. Cheers, Ajaya __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack
Re: [openstack-dev] [glance] Why no DB index on sort parameters
Messing with indices is not a good idea to do iteratively. Indexing large data sets is a really expensive operation and should be done carefully and consistently. Changing around indices is only going to make things unstable. Thanks, -Nikhil From: Flavio Percoco fla...@redhat.com Sent: Thursday, April 23, 2015 7:52 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 21/04/15 14:55 +, Nikhil Komawar wrote: Rally is great overall however, we need good EXPLAIN examples on real world data. Smaller deployments might benefit from a simple sample performance analysis however, larger data sets can have impacts on areas that you never expect. A spec means that we document the indices proposed in the code base, based on all of the use cases. The way I look at it, a patch is needed anyways and it (rally gate job) would get attention from reviewers when the patch is proposed. Yes, I believe we need both. However, I'd probably just start with something smaller and see how it behaves before going with big data sets. I'm not saying we don't need tests with proper data sets, I'm saying that I'd probably start with smaller ones. As Mike already mentioned in his email, there's an impact in writes and we can see that from Rally tests, AFAICT. The spec can come later, IMHO. Cheers, Flavio From: Flavio Percoco fla...@redhat.com Sent: Tuesday, April 21, 2015 10:48 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 21/04/15 14:39 +, Nikhil Komawar wrote: This is a good idea. We recently removed a unique constraint that may result into some queries being very slow especially those that involve name property. I would recommend sketching out a spec that identifies potential full table scans especially for queries that join over image_properties table. We should discuss there what other use cases look like rather than smaller feedback on the ML. More thatn a spec, I'd be interested in seeing the patch with the change up and the results reported in Rally. I guess we'll need a spec anyway, although I'd probably be ok with a good bug report here. /me *shrugs* Flavio Thanks, -Nikhil ━━━ From: Mike Bayer mba...@redhat.com Sent: Tuesday, April 21, 2015 9:45 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 4/21/15 2:47 AM, Ajaya Agrawal wrote: Hi All, I see that glance supports arbitrary sort parameters and the default is created_at while listing images. Is there any reason why we don't have index over these fields? If we have an index over these fields then we would avoid a full table scan to do sorting. IMO at least the created_at field should have an index on it. just keep in mind that more indexes will place a performance penalty on INSERT statements, particularly at larger volumes. I have no idea if that is important here but something to keep in mind. Cheers, Ajaya __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- @flaper87 Flavio Percoco __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- @flaper87 Flavio Percoco __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [glance] Why no DB index on sort parameters
On 21/04/15 14:55 +, Nikhil Komawar wrote: Rally is great overall however, we need good EXPLAIN examples on real world data. Smaller deployments might benefit from a simple sample performance analysis however, larger data sets can have impacts on areas that you never expect. A spec means that we document the indices proposed in the code base, based on all of the use cases. The way I look at it, a patch is needed anyways and it (rally gate job) would get attention from reviewers when the patch is proposed. Yes, I believe we need both. However, I'd probably just start with something smaller and see how it behaves before going with big data sets. I'm not saying we don't need tests with proper data sets, I'm saying that I'd probably start with smaller ones. As Mike already mentioned in his email, there's an impact in writes and we can see that from Rally tests, AFAICT. The spec can come later, IMHO. Cheers, Flavio From: Flavio Percoco fla...@redhat.com Sent: Tuesday, April 21, 2015 10:48 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 21/04/15 14:39 +, Nikhil Komawar wrote: This is a good idea. We recently removed a unique constraint that may result into some queries being very slow especially those that involve name property. I would recommend sketching out a spec that identifies potential full table scans especially for queries that join over image_properties table. We should discuss there what other use cases look like rather than smaller feedback on the ML. More thatn a spec, I'd be interested in seeing the patch with the change up and the results reported in Rally. I guess we'll need a spec anyway, although I'd probably be ok with a good bug report here. /me *shrugs* Flavio Thanks, -Nikhil ━━━ From: Mike Bayer mba...@redhat.com Sent: Tuesday, April 21, 2015 9:45 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 4/21/15 2:47 AM, Ajaya Agrawal wrote: Hi All, I see that glance supports arbitrary sort parameters and the default is created_at while listing images. Is there any reason why we don't have index over these fields? If we have an index over these fields then we would avoid a full table scan to do sorting. IMO at least the created_at field should have an index on it. just keep in mind that more indexes will place a performance penalty on INSERT statements, particularly at larger volumes. I have no idea if that is important here but something to keep in mind. Cheers, Ajaya __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- @flaper87 Flavio Percoco __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- @flaper87 Flavio Percoco pgpL5YJyWI2GL.pgp Description: PGP signature __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [glance] Why no DB index on sort parameters
Hi All, I see that glance supports arbitrary sort parameters and the default is created_at while listing images. Is there any reason why we don't have index over these fields? If we have an index over these fields then we would avoid a full table scan to do sorting. IMO at least the created_at field should have an index on it. Cheers, Ajaya __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [glance] Why no DB index on sort parameters
On 21/04/15 12:17 +0530, Ajaya Agrawal wrote: Hi All, I see that glance supports arbitrary sort parameters and the default is created_at while listing images. Is there any reason why we don't have index over these fields? If we have an index over these fields then we would avoid a full table scan to do sorting. IMO at least the created_at field should have an index on it. I don't think there's an actual reason, OTOH. I'd recommend proposing a patch and wait for the reviews there. :) Flavio -- @flaper87 Flavio Percoco pgp3yiNZiCDnJ.pgp Description: PGP signature __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [glance] Why no DB index on sort parameters
This is a good idea. We recently removed a unique constraint that may result into some queries being very slow especially those that involve name property. I would recommend sketching out a spec that identifies potential full table scans especially for queries that join over image_properties table. We should discuss there what other use cases look like rather than smaller feedback on the ML. Thanks, -Nikhil From: Mike Bayer mba...@redhat.com Sent: Tuesday, April 21, 2015 9:45 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 4/21/15 2:47 AM, Ajaya Agrawal wrote: Hi All, I see that glance supports arbitrary sort parameters and the default is created_at while listing images. Is there any reason why we don't have index over these fields? If we have an index over these fields then we would avoid a full table scan to do sorting. IMO at least the created_at field should have an index on it. just keep in mind that more indexes will place a performance penalty on INSERT statements, particularly at larger volumes. I have no idea if that is important here but something to keep in mind. Cheers, Ajaya __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribemailto:openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [glance] Why no DB index on sort parameters
On 21/04/15 14:39 +, Nikhil Komawar wrote: This is a good idea. We recently removed a unique constraint that may result into some queries being very slow especially those that involve name property. I would recommend sketching out a spec that identifies potential full table scans especially for queries that join over image_properties table. We should discuss there what other use cases look like rather than smaller feedback on the ML. More thatn a spec, I'd be interested in seeing the patch with the change up and the results reported in Rally. I guess we'll need a spec anyway, although I'd probably be ok with a good bug report here. /me *shrugs* Flavio Thanks, -Nikhil ━━━ From: Mike Bayer mba...@redhat.com Sent: Tuesday, April 21, 2015 9:45 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters On 4/21/15 2:47 AM, Ajaya Agrawal wrote: Hi All, I see that glance supports arbitrary sort parameters and the default is created_at while listing images. Is there any reason why we don't have index over these fields? If we have an index over these fields then we would avoid a full table scan to do sorting. IMO at least the created_at field should have an index on it. just keep in mind that more indexes will place a performance penalty on INSERT statements, particularly at larger volumes. I have no idea if that is important here but something to keep in mind. Cheers, Ajaya __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- @flaper87 Flavio Percoco pgpwzvNpPX67O.pgp Description: PGP signature __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [glance] Why no DB index on sort parameters
On 4/21/15 2:47 AM, Ajaya Agrawal wrote: Hi All, I see that glance supports arbitrary sort parameters and the default is created_at while listing images. Is there any reason why we don't have index over these fields? If we have an index over these fields then we would avoid a full table scan to do sorting. IMO at least the created_at field should have an index on it. just keep in mind that more indexes will place a performance penalty on INSERT statements, particularly at larger volumes. I have no idea if that is important here but something to keep in mind. Cheers, Ajaya __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev