Re: [sqlite] Anomalously slow performance on updates to earlyentries in a DB

O'Toole, Eamonn Fri, 09 Nov 2012 07:16:05 -0800

Thanks again Michael.


> -----Original Message-----
> From: [email protected] [mailto:sqlite-users-
> [email protected]] On Behalf Of Black, Michael (IS)
> Sent: 09 November 2012 15:07
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Anomalously slow performance on updates to
> earlyentries in a DB
> 
> What I would do is find the max length of your data fields.
> Then dump the database, change the create table to use default values
> at those string lengths.
> Import it.
> 
> See what that does for you.  Or just reload your data the way you've
> been doing with the new default string lengths.
> 
> Also a compound index on name/created_at could help you a lot.
> 
> And since you're in a trigger I'm not sure if the insert or replace
> would help since that's essentially what you're doing anyways and I
> believe that's all wrapped inside a transaction inside triggers.
> Somebody please correct me if this is not true.
> 
> 
> 
> Michael D. Black
> Senior Scientist
> Advanced Analytics Directorate
> Advanced GEOINT Solutions Operating Unit
> Northrop Grumman Information Systems
> 
> ________________________________________
> From: [email protected] [[email protected]]
> on behalf of O'Toole, Eamonn [[email protected]]
> Sent: Friday, November 09, 2012 8:53 AM
> To: General Discussion of SQLite Database
> Subject: EXT :Re: [sqlite] Anomalously slow performance on updates to
> earlyentries in a DB
> 
> > -----Original Message-----
> > From: [email protected] [mailto:sqlite-users-
> > [email protected]] On Behalf Of Black, Michael (IS)
> > Sent: 09 November 2012 14:26
> > To: General Discussion of SQLite Database
> > Subject: Re: [sqlite] Anomalously slow performance on updates to
> > earlyentries in a DB
> >
> > >[O'Toole, Eamonn] This definitely sounds like it could be an issue.
> > There is just one container_stat entry >confirmed by sqlite3_analyzer
> > output which I'll post later).  So you're saying that the single
> > container_stat table :entry is potentially being relocated very
> > frequently the closer the update is to the beginning of the db?
> >
> > Yes....you didn't say what your data flow is...but since it's
> account-
> > based I assume you have a bunch of accounts that get preloaded.
> [O'Toole, Eamonn] In Swift the db is used to store information on the
> containers.  The hierarchy in Swift is account->container->object.  The
> test targets a specific container of a specific account, which contains
> information on 10 million objects, and changes the "created_at" field
> for the first 2 million entries.  First we create the container, add
> the 10 million entries to it, then we start the update cycle.  We see
> this slow performance on every update run, although if you run updates
> in succession without any break in between runs you do see an
> improvement in performance due to caching.
> 
> 
> > So the first n-thousand records are Size1.
> > You then start updating each of those...none of them are big
> > enough...the new records get inserted at the first available empty
> slot
> > (is that actually how this works or is there another row allocation
> > strategy?).
> [O'Toole, Eamonn] The SQL transactions are done through python.  This
> is the relevant section of code that deals with the object table, the
> container_stat table is updated by the triggers:
> 
>             for rec in item_list:
>                 query = '''
>                     DELETE FROM object
>                     WHERE name = ? AND (created_at < ?)
>                 '''
>                 if self.get_db_version(conn) >= 1:
>                     query += ' AND deleted IN (0, 1)'
>                 conn.execute(query, (rec['name'], rec['created_at']))
>                 query = 'SELECT 1 FROM object WHERE name = ?'
>                 if self.get_db_version(conn) >= 1:
>                     query += ' AND deleted IN (0, 1)'
>                 if not conn.execute(query, (rec['name'],)).fetchall():
>                     conn.execute('''
>                         INSERT INTO object (name, created_at, size,
>                             content_type, etag, deleted)
>                         VALUES (?, ?, ?, ?, ?, ?)
>                     ''', ([rec['name'], rec['created_at'], rec['size'],
>                           rec['content_type'], rec['etag'],
> rec['deleted']]))
> 
> 
> > Now you go to update those records again...they can't fit in the 1st
> > block...and some percentage of the data won't fit into the 2nd block
> > (depends on the variability in size).  So, let's say half the records
> > get relocated....eventually you reach homeostasis.
> >
> > Also...disk fragmentation could be affecting you too but I wouldn't
> > expect an order of magnitude difference on that.
> >
> [O'Toole, Eamonn] Disk fragmentation is definitely a factor, but as you
> say defragging doesn't get you an order of magnitude improvement.
> > Are you on Windows or Unix?
> [O'Toole, Eamonn] Linux (Ubuntu)
> >
> > Also...are your records indexed for the updates?
> [O'Toole, Eamonn] The only object table index that I can see is the
> ix_deleted_name index, and that isn't used by the table update logic.
> >
> >
> >
> > Michael D. Black
> > Senior Scientist
> > Advanced Analytics Directorate
> > Advanced GEOINT Solutions Operating Unit
> > Northrop Grumman Information Systems
> >
> > _______________________________________________
> > sqlite-users mailing list
> > [email protected]
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Anomalously slow performance on updates to earlyentries in a DB

Reply via email to