Re: [sqlite] 100% CPU utilization sqlite running in VMWare
ESXI 5 Disk is on a local host, running raid 6 on a LSI raid card Machine has HT turned off so only scheduling full cores System takes up less than 500mb of RAM and uses less than 1% of CPU Intel Xeon X5482 3.2Ghz 64GB RAM 6TB Raid 6 10GB NIC VMXNET NIC#2 Windows Server 2008 R2 Dedicated Development Box VM #1: 8 Core 6GB (Writes SQLITE DB's to C and shares directory) VM#2: 4 Core 4GB (Writes SQLITE DB's out to Share on VM#1) The Tick Tapes are on 40TB NAS Looking through the code I noticed sqlite3_exec(db, zSQL, 0, 0, 0); wrapped inside transaction. Suspect this is causing the CPU load, going to try recoding using prepared statements. Response to Keith Medalf What version of ESX and where is your disk coming from? Are you scheduling full cores or fake cores? What is the split between USER/SYSTEM CPU on the VM when it tanks? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] 100% CPU utilization sqlite running in VMWare
There was a significant increase in throughput converting the sqlite3_exec inserts into prepared statements: I suspect every previous insert was compiling versus now only 1 of every 10,000 inserts. I'll bump this up to 100K next revision I've changed many variables so its hard to base line the differential: PRAGMA synchronous=OFF PRAGMA count_changes=OFF PRAGMA page_size=32768 PRAGMA temp_store=MEMORY PRAGMA journal_mode=OFF Library compiled with # define SQLITE_THREADSAFE 0 The CPU is still pegged at 100% but the extraction databases appear to be generating much faster. Currently testing 30 extractors running on a 4 core VM. Reply to Simon There's nothing wrong with sqlite3_exec that will causes excessive CPU usage, but the above usage of it suggests someone put that together quickly. The last three parameters should be more like NULL, 0, zErrMsg showing no callback, a dummy argument for the non-existant callback, and somewhere for SQLite to put an error message if there is one. Even if you burst the _exec() into _prepare(), _step(), _finalize(), you'll still have to provide the equivalent parameters somewhere. I don't know whether using 0, 0, 0 when there is no error will cause any problems. I suspect not. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] 100% CPU utilization sqlite running in VMWare
No indexes... create db and table structure - process tick tapes inserting records - than create index. Windows 2008 R2 Servers ... We can throw hardware at this problem. We need to be able to run 1000's of extractors concurrently processing different tick tapes and symbol sets. aka service bureau. The Daily tick tapes are approx 20gb each.. 30TB repository and growing. An extraction run take 1 - 5 minutes for small symbol sets per tape. An example would be to concurrently extract 5 years of a particular stock's tick data. 1500 days x 2 minute extraction jobs across 15 VMs each running 100 extractors. I'll try journal mode off and increasing page size. Response to: So you want to minimize the cpu usage? It looks like your os is running on the last core, it has work to do too ;) Apart form using directly a hard disk instead a vdmk virtual hard disk, i don't know other tricks on vmware sorry. Did you create the indexs before the mass inserts? If yes, sqlite is reindexing after your 10.000 inserts. Create the tables without any indexes, and recreate them after the bulk insert. Set pragma automatic_index=FALSE too to disable automatic index creation on primary key on the not indexed table. Calculate how much database size increase happens when you insert the 10.000 rows. If you are using wal mode, set wal file size to this size plus a bit more. This way only one wal file will be used. Other journal modes (try off mode) may be faster but riskier. 9MB of memory, perhaps the system is copying data between kernel structures and sqlite, like disk cache entries (I don't know what os are you using, it's a blind shoot) try to increase the pragma cache page. Increase page size to 8K-32K, this way sqlite will reduce the amount of administrative internal work on pages. Of course, read the documentation, and if you have any cuestion on it ask. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] 100% CPU utilization sqlite running in VMWare
Running a single extractor on a VM runs near bare metal speeds. I can run concurrent extractors up to the # of cores allocated to the VM and sustain decent throughput. Once I go above this count the VM's get crippled.. Pegged at 100% - Disk, RAM and LAN loads are miniscule. For a test we ran 50 extractors across 2 VM's (8 core and 4 core) and it took 14 hours to extract a single symbol. A single exctractor running on one 4 core VM can do the job in 50 hours. The same run limited to running 12 extractors finished in 4 hours. There is obviously an issue with the virtualization but no clear fix or workaround were aware of. Our objective is to extract and process all tick tapes in one shot. Response to Unless pegging the CPU causes the hypervisor to preempt the guest's synchronization primitives and cause priority inversion, in which case OP's throughput would plummet. If throughput remains reasonably stable in the VM, though, I would agree that a pegged CPU isn't necessarily a problem. Another possibility: the guest OS idle loop will probably count as CPU usage on the host, and with heavy load the guest won't necessarily attempt to park the cores it was using in between scheduling decisions; on bare metal, the spinning idle loop doesn't get reported as CPU usage, even though it is. Ryan ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] 100% CPU utilization sqlite running in VMWare
Yes: Inserts wrapped in transactions 10,000 at a shot. Tried pragma synchronous=OFF And 3 different threadsafe compilations for sqlite: # define SQLITE_THREADSAFE 0 /* IMP: R-07272-22309 */ # define SQLITE_THREADSAFE 1 /* IMP: R-07272-22309 */ # define SQLITE_THREADSAFE 2 /* IMP: R-07272-22309 */ All other settings are default. all 6 versions peg the cpu when i run concurrent instances = # of cores. Bare Metal wrote to local disk. Will test writing to same share next. VM's write to NAS (Windows server 2008 mapped network drive). Response to: Have you run your bare metal test using the same share? And have you tried pragma synchronous=OFF ?? And I assume you are using transactions? Have you profiled your app to see where it's spinning? Probably trying to get the lock on the share. What flags did you use to comiple the sqlite library? In particular for threadsafe? Michael D. Black Senior Scientist Advanced Analytics Directorate Advanced GEOINT Solutions Operating Unit Northrop Grumman Information Systems ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] 100% CPU utilization sqlite running in VMWare
Have a market data extractor running tick tapes and saving data to sqlite db. All coded in C++ and previously ran fine and fast on bare metal. Processes tick tapes from NAS and saved to local sqlite db's on c drive. We moved program onto VMWare VM's... same nas tick tapes. Writing db's to a share. On a 4 core vm we can run 3 extractors concurrently at 25 - 75% CPU utilization. The 4th pegs the CPU at 100% Same on an 8 core VM ... seems we can run Cores - 1 without pegging the CPU. The memory footprint for the program is a tiny 9mb... On bare metal we can run 20 extractors concurrently. Looking for suggestions on how to optimize for VMware. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Update Query
Keith, Trying to work through your suggestions: I'm getting Misuse of aggregate max() CREATE VIEW TICKMAX AS SELECT ASK, BID, TRADEPRICE, TIMESTAMP, SYMBOL FROM TICKDATA WHERE TIMESTAMP = MAX(TIMESTAMP) GROUP BY SYMBOL; ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Update Query
Not sure how to combine the subqueries. Our TickData table looks close to: TimeStamp, Symbol, Ask, Bid, Last ... IQ_A, IQ_B, IQ_T, IQ_X We need to update the IQ fields with calculations always made between the current record and previous symbol record ordered by timestamp. I don't know how to eliminate the 4 subqueries... Anyone have any code for implementing oracle style lag / lead analytic functions? This update query has become a monster... string functions, math calcs, sign checks, case, case etc. All very simple to do in Excel but not so easy as a Sql update query. The IQ fields hold arrays of values formatted into a string and are used to feed different downstream processes: ie. charting engine and price analysis. IQ_A = {1,-2,3,-19,25,-28,175,-228,359,-611,7105.5,-14165,21247.5;41019.5416927083,41019.5416741898,41019.5416241319,41019.5415546875,41019.5037196181,41019.4764363426,41019.939236,41018.6088674769,41016.6075604745,41009.610889,41002.4481603009,41002.395833912,41002.395833912} IQ_B = {1,-2,3,-19,25,-28,175,-228,359,-611,7105.5,-14163,21244.5;41019.5416927083,41019.5416741898,41019.5416241319,41019.5415546875,41019.5037196181,41019.4764363426,41019.939236,41018.6088674769,41016.6075604745,41009.610889,41002.4481603009,41002.395833912,41002.395833912} IQ_T {-1,3,-19,25,-29,175,-227.5,359.5,-611,7105,-14164,21246;41019.5416741898,41019.5416232639,41019.5415546875,41019.5038145255,41019.4764427083,41019.939236,41018.6088674769,41016.6077332176,41009.6107991898,41002.4481603009,41002.395833912,41002.395833912} IQ_X {39653,-13516,6737,-3974,-2650,-1874,-1376,-1084,-886,-726,-592,-522,-454,-376,-316,-302,-266,-248,-232,215;39667,-13512,6737,-3974,-2650,-1874,-1378,-1086,-886,-726,-592,-522,-454,-376,-316,-302,-266,-248,-232,215;-19456,9205,5157,-3240,-2234,-1618,-1258,-998,-808,-682,-564,-490,-412,-352,-312,-290,-262,-248,-230,209} I'm able to run the core calculation portions of the update query very fast. ie. 5 seconds to update 1 ... The string / array manipulations are another story. I may try a 2 pass approach: Pass 1 update query to record the rowid of the previous symbol record and pass 2 use a join for performing the IQ calcs. I think we might have hit the wall and need to code this externally. You appear to be doing the same sub-select four times. I don't know whether SQLite optimizes these into one or not, but if you're using a programming language I bet you could speed the query up a great deal by doing that subselect first, and substituting the results into the UPDATE command. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] (no subject)
How can I construct a update query to calculate and set a record field latency with the difference between timestamps by deviceid? Appears sqlite doesn't support lag and lead. ie. I have a table with 1,000,000 + records collecting real time stats from many devices with many columns but the main ones of interest are 'timestamp', 'latency' and 'DeviceID'. 2012-05-01 13:12:11.103 Null 14356 2012-05-01 13:12:11.103 Null 14372 2012-05-01 13:12:11.103 Null 4356 2012-05-01 13:12:07.103 Null 14356 2012-05-01 13:12:11.221 Null14356 The data is collected inconsistently... many times out of sequence I want to update the latency field for each record with the latest previous timestamp for the same DeviceID. Run this update query every time new data is appended to the table. Periodically recalc the entire table if an out of sequence file is processed or at minimum recalc from the earliest out of sequence time to present. I can do this fairly easily in excel but am struggling in sql/sqlite. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Update Query
Thanks for your suggestion Igor Tandetnik: Scope creep expanded the original query to the actual trading instruments and the refactored code has evolved to:: update TICKDATA set IQ_A = ROUND(ASK - ( select t2.ASK from TICKDATA t2 where t2.SYMBOL = TICKDATA.SYMBOL and t2.TIMESTAMP TICKDATA.TIMESTAMP ORDER BY T2.SYMBOL, t2.timestamp DESC LIMIT 1),4), IQ_B = ROUND(BID - ( select t2.BID from TICKDATA t2 where t2.SYMBOL = TICKDATA.SYMBOL and t2.TIMESTAMP TICKDATA.TIMESTAMP ORDER BY T2.SYMBOL, t2.timestamp DESC LIMIT 1),4), IQ_T = ROUND(TRADEPRICE - ( select t2.TRADEPRICE from TICKDATA t2 where t2.SYMBOL = TICKDATA.SYMBOL and t2.TIMESTAMP TICKDATA.TIMESTAMP ORDER BY T2.SYMBOL, t2.timestamp DESC LIMIT 1),4), IQ_X = (select t2.timestamp from TICKDATA t2 where t2.SYMBOL = TICKDATA.SYMBOL and t2.TIMESTAMP TICKDATA.TIMESTAMP ORDER BY T2.SYMBOL, t2.timestamp DESC LIMIT 1); The consolidated tick database is 1.5GB / 32M records and the above query takes 5 minutes to run. Found proper index is key for performance... Symbol,Timestamp ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users