Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs
Hello, Thanks for your advice! It works pretty well when I dropped the rowid and set the other field as primary key. I also enable wal mode at the cost of a little bit read performance. Overall, it works really well now. Really appreciate your help. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs
Thanks for the advice! I could never thought about this. Really learnt a lot. It's a customized Android phone type device. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs
> I've also attached results of "EXPLAIN UPDATE ITEM SET FIELD4 = 10 WHERE DS = 15;". Not zipvfs specific, but that kind of update can be quite inefficient if the record is large (in terms of bytes, or records per database page) as your table declaration hints to. This will be especially prominent with compression, as a single byte changed early in the uncompressed stream can lead to changing pretty much every byte of the compressed data, and so is bound to defeat any optimizations the storage could have to handle unmodified pages and data. Depending on your update patterns and frequencies, it could thus be beneficial to split your table into two tables (or more) with a common primary key: - one table for the rarely changed fields (even better if they are immutable) - one table (or more) for the fields that change often This will increase the database size and reduce insert performance, and it can complicate querying when you need everything (extra inner joins), but it can help reduce the amount of raw data that is affected by updates quite drastically. I have been able to use the above strategy sucessfully, however it was under Windows compressed folders, so YMMV. On Thu, Sep 7, 2017 at 9:13 PM, Yue Wu wrote: > Hello, > As mentioned in the subject, our goal is to improve performance regarding > to batch sql updates. > The update sql as follow, > > > UPDATE ITEM SET FIELD4 =? WHERE DS=? > > We run 100,000 updates in a single transaction. The zipvfs version takes > about 20 min while uncompressed version takes about 7 min. > I've also attached results of "EXPLAIN UPDATE ITEM SET FIELD4 = 10 WHERE DS > = 15;". > > Any suggestions is very helpful. > > Some background: > We have an app running on Android 4.2 using zipvfs. > The page size is 4096 > The cache size is - 2000 > > Table schema for ITEM > > > CREATE TABLE ITEM ( > > FIELD0 NUMERIC, > > FIELD1 NUMERIC, > > DS TEXT, > > FIELD2 TEXT, > > FIELD3 TEXT, > > FIELD4 NUMERIC, > > FIELD5 NUMERIC, > > FIELD6 NUMERIC, > > FIELD7 NUMERIC, > > FIELD8 NUMERIC, > > FIELD9 NUMERIC, > > FIELD10 NUMERIC, > > FIELD11 TEXT); > > > The third column: "DS" is what we query by almost all the time. We also > created index: > > > CREATE INDEX DS_INDEX ON ITEM(DS); > > > There are 30 million records. Zipvfs size of 471mb vs uncompressed db (the > one shipped with Android 4.2) size of 1.39gb. > Zipvfs db using zlib and aes128, which are default. > > Thanks > -- > > Yue Wu | Android Developer > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs
On 9 Sep 2017, at 9:11pm, Teg wrote: > Is it a phone or tablet? I'd wonder if the CPU is going into thermal > throttle mode. As I understand it most phones and tablets can't keep > the CPU's running full speed without eventually throttling the CPU. I would never have thought of that unless I had the device in my hands. Well done. Yes. You’re trying to process 100,000 updates, and your device is working flat out for 7 minutes, or for 20 minutes. How hot is it ? If you do it unplugged, how much does that drain the battery ? I don’t have experience with Android but I can tell you that iOS will have to throttle any process that attempts to do process data continuously for 7 minutes. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs
Hello Yue, Is it a phone or tablet? I'd wonder if the CPU is going into thermal throttle mode. As I understand it most phones and tablets can't keep the CPU's running full speed without eventually throttling the CPU. To me the fact uncompressed, while moving much more data, is more than 2 times faster suggests the bottleneck isn't the fileIO. Thursday, September 7, 2017, 3:13:04 PM, you wrote: YW> Hello, YW> As mentioned in the subject, our goal is to improve performance regarding YW> to batch sql updates. YW> The update sql as follow, >> UPDATE ITEM SET FIELD4 =? WHERE DS=? YW> We run 100,000 updates in a single transaction. The zipvfs version takes YW> about 20 min while uncompressed version takes about 7 min. YW> I've also attached results of "EXPLAIN UPDATE ITEM SET FIELD4 = 10 WHERE DS YW> = 15;". YW> Any suggestions is very helpful. YW> Some background: YW> We have an app running on Android 4.2 using zipvfs. YW> The page size is 4096 YW> The cache size is - 2000 YW> Table schema for ITEM >> CREATE TABLE ITEM ( >> FIELD0 NUMERIC, >> FIELD1 NUMERIC, >> DS TEXT, >> FIELD2 TEXT, >> FIELD3 TEXT, >> FIELD4 NUMERIC, >> FIELD5 NUMERIC, >> FIELD6 NUMERIC, >> FIELD7 NUMERIC, >> FIELD8 NUMERIC, >> FIELD9 NUMERIC, >> FIELD10 NUMERIC, >> FIELD11 TEXT); YW> The third column: "DS" is what we query by almost all the time. We also YW> created index: >> CREATE INDEX DS_INDEX ON ITEM(DS); YW> There are 30 million records. Zipvfs size of 471mb vs uncompressed db (the YW> one shipped with Android 4.2) size of 1.39gb. YW> Zipvfs db using zlib and aes128, which are default. YW> Thanks -- Tegmailto:t...@djii.com ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs
Thanks for the advice. I'll try them now. -- Sent from: http://sqlite.1065341.n5.nabble.com/ ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs
Are your updates sorted by DS? If your queries are sorted then sequential queries are more likely to hit the same db pages while searching the index, resulting in higher cache usage and fewer decompression operations. This would have less benefit if your 100k DS values of the updates are randomly distributed through the 30m available, and more of an effect if they're tightly clustered in some range. If you mainly query by DS, is that the same as saying it is your primary key? If so, have you tried to declare it so and try the table WITHOUT ROWID, and don't bother with the index? It may help since you wouldn't have to decompress both the index pages and the table pages. > On 8 Sep 2017, at 12:33 pm, Dominique Pellé wrote: > > Yue Wu wrote: > >> Hello, >> As mentioned in the subject, our goal is to improve performance regarding >> to batch sql updates. >> The update sql as follow, >> >>> UPDATE ITEM SET FIELD4 =? WHERE DS=? >> >> We run 100,000 updates in a single transaction. The zipvfs version takes >> about 20 min while uncompressed version takes about 7 min. > > Which compression algorithm do you use with zipvfs? > Try LZ4, it's is times faster than zlib at compression and > decompression, at the cost of compressing a bit less: > > https://github.com/lz4/lz4 > > Or try zstd, which can compress more than zlib and faster (especially > on 64-bit architectures), but it's not as fast as LZ4: > > http://facebook.github.io/zstd/ > > What is also your SQLite page size? > Reducing the SQLite page size probably helps to speed up > updates, since zipvfs compresses by pages. > > Dominique > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs
Yue Wu wrote: > Hello, > As mentioned in the subject, our goal is to improve performance regarding > to batch sql updates. > The update sql as follow, > >> UPDATE ITEM SET FIELD4 =? WHERE DS=? > > We run 100,000 updates in a single transaction. The zipvfs version takes > about 20 min while uncompressed version takes about 7 min. Which compression algorithm do you use with zipvfs? Try LZ4, it's is times faster than zlib at compression and decompression, at the cost of compressing a bit less: https://github.com/lz4/lz4 Or try zstd, which can compress more than zlib and faster (especially on 64-bit architectures), but it's not as fast as LZ4: http://facebook.github.io/zstd/ What is also your SQLite page size? Reducing the SQLite page size probably helps to speed up updates, since zipvfs compresses by pages. Dominique ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users