Hi!

So, we're still havng troubles with one of our sqlite databases, or more 
likely one of our disks, but our system administrators cannot find 
anything wrong with it.

Perhaps someone reading this list has a clue what might be the case with 
this disk, or something else, when the sqlite client is stuck with the 
following stack traces (I made a couple of them after the query is 
already running for many minutes).

Disk is not busy but there is one process that is inserting records into 
the table that is at the same time queried by the sqlite client, a 
couple hundred thousand records per day.
(On other machines that insert even more records we don't see this 
happening and with the sqlite client there I can query whatever i want 
and query returns in a second and many times much faster. On all the 
machines the access pattern and configuration of the sqlite database is 
exactly the same)



The table definition is:
sqlite> .schema ordercallback
CREATE TABLE ordercallback (sessionuuid TEXT NOT NULL,cb_seq_num INTEGER 
NOT NULL,cb_uuid TEXT NOT NULL,server_order_id TEXT NOT NULL,product_id 
INTEGER NOT NULL,int_account INTEGER NOT NULL,ext_account INTEGER NOT 
NULL,client_order_id INTEGER NOT NULL,user_reference TEXT NOT 
NULL,user_exch_ref TEXT NOT NULL,buy_sell INTEGER NOT NULL,price INTEGER 
NOT NULL,stop_price INTEGER NOT NULL,size INTEGER NOT 
NULL,disclosed_size INTEGER NOT NULL,residual_size INTEGER NOT 
NULL,current_traded_size INTEGER NOT NULL,last_trade_price INTEGER NOT 
NULL,ext_trade_id TEXT NOT NULL,tot_traded_size INTEGER NOT 
NULL,imatch_traded_size INTEGER NOT NULL,avg_traded_price INTEGER NOT 
NULL,user_id INTEGER NOT NULL,trader_id INTEGER NOT NULL,status_time 
INTEGER NOT NULL,orig_time INTEGER NOT NULL,endstate INTEGER NOT 
NULL,order_type INTEGER NOT NULL,order_time_type INTEGER NOT 
NULL,order_price_type INTEGER NOT NULL,user_order_type INTEGER NOT 
NULL,expiry_date INTEGER NOT NULL,price_signs INTEGER NOT 
NULL,fill_flags INTEGER NOT NULL,specific_order_route TEXT NOT 
NULL,future_fields TEXT NOT NULL,reason TEXT NOT NULL,working INTEGER 
NOT NULL,FOREIGN KEY(sessionuuid) REFERENCES session(uuid) ON DELETE 
CASCADE);
CREATE INDEX ordercallback_index1 ON ordercallback (sessionuuid, 
int_account, cb_seq_num);
CREATE INDEX ordercallback_index2 ON ordercallback (sessionuuid, 
ext_account, cb_seq_num);
CREATE INDEX ordercallback_index3 ON ordercallback (server_order_id, 
sessionuuid, working);
CREATE INDEX ordercallback_index4 ON ordercallback (sessionuuid, 
endstate, working, int_account);
CREATE INDEX ordercallback_index5 ON ordercallback (sessionuuid, 
endstate, working, ext_account);
CREATE INDEX ordercallback_index6 ON ordercallback (cb_uuid);

On the machine where the query gets stuck there is more variation in 
values for the 'ext_account' column. Could that have to do with it?






Regards,
Gunnar


  [hiq at hiqserver2 ~]$ pstack 96660
#0  0x00000034f020e530 in __read_nocancel () from /lib64/libpthread.so.0
#1  0x000000000041a485 in seekAndRead ()
#2  0x000000000041a5f9 in unixRead ()
#3  0x0000000000410a86 in sqlite3OsRead ()
#4  0x0000000000422433 in readDbPage ()
#5  0x0000000000424e88 in sqlite3PagerAcquire ()
#6  0x000000000042c514 in btreeGetPage ()
#7  0x000000000042c60a in getAndInitPage ()
#8  0x000000000042fc25 in moveToChild ()
#9  0x0000000000430871 in sqlite3BtreeMovetoUnpacked ()
#10 0x000000000043d65f in sqlite3VdbeCursorMoveto ()
#11 0x0000000000444e7d in sqlite3VdbeExec ()
#12 0x0000000000440163 in sqlite3Step ()
#13 0x000000000044034d in sqlite3_step ()
#14 0x0000000000405240 in shell_exec ()
#15 0x000000000040c4aa in process_input ()
#16 0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
  [hiq at hiqserver2 ~]$ pstack 96660
#0  0x000000000043ec95 in sqlite3VdbeRecordCompare ()
#1  0x000000000043f51f in sqlite3VdbeIdxKeyCompare ()
#2  0x0000000000448db1 in sqlite3VdbeExec ()
#3  0x0000000000440163 in sqlite3Step ()
#4  0x000000000044034d in sqlite3_step ()
#5  0x0000000000405240 in shell_exec ()
#6  0x000000000040c4aa in process_input ()
#7  0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
  [hiq at hiqserver2 ~]$ pstack 96660
#0  0x00000034f020e530 in __read_nocancel () from /lib64/libpthread.so.0
#1  0x000000000041a485 in seekAndRead ()
#2  0x000000000041a5f9 in unixRead ()
#3  0x0000000000410a86 in sqlite3OsRead ()
#4  0x0000000000422433 in readDbPage ()
#5  0x0000000000424e88 in sqlite3PagerAcquire ()
#6  0x000000000042c514 in btreeGetPage ()
#7  0x000000000042c60a in getAndInitPage ()
#8  0x000000000042fc25 in moveToChild ()
#9  0x000000000042fff2 in moveToLeftmost ()
#10 0x0000000000430b6d in sqlite3BtreeNext ()
#11 0x00000000004488d1 in sqlite3VdbeExec ()
#12 0x0000000000440163 in sqlite3Step ()
#13 0x000000000044034d in sqlite3_step ()
#14 0x0000000000405240 in shell_exec ()
#15 0x000000000040c4aa in process_input ()
#16 0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
  [hiq at hiqserver2 ~]$ pstack 96660
#0  0x00000034f020ec50 in __lseek_nocancel () from /lib64/libpthread.so.0
#1  0x000000000041a42c in seekAndRead ()
#2  0x000000000041a5f9 in unixRead ()
#3  0x0000000000410a86 in sqlite3OsRead ()
#4  0x0000000000422433 in readDbPage ()
#5  0x0000000000424e88 in sqlite3PagerAcquire ()
#6  0x000000000042c514 in btreeGetPage ()
#7  0x000000000042c60a in getAndInitPage ()
#8  0x000000000042fc25 in moveToChild ()
#9  0x0000000000430871 in sqlite3BtreeMovetoUnpacked ()
#10 0x000000000043d65f in sqlite3VdbeCursorMoveto ()
#11 0x0000000000444e7d in sqlite3VdbeExec ()
#12 0x0000000000440163 in sqlite3Step ()
#13 0x000000000044034d in sqlite3_step ()
#14 0x0000000000405240 in shell_exec ()
#15 0x000000000040c4aa in process_input ()
#16 0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
  [hiq at hiqserver2 ~]$ pstack 96660
#0  0x000000000041df73 in pcacheUnpin ()
#1  0x000000000041e44f in sqlite3PcacheRelease ()
#2  0x0000000000424f41 in sqlite3PagerUnrefNotNull ()
#3  0x000000000042c667 in releasePage ()
#4  0x000000000042fdac in moveToRoot ()
#5  0x00000000004302fd in sqlite3BtreeMovetoUnpacked ()
#6  0x000000000043d65f in sqlite3VdbeCursorMoveto ()
#7  0x0000000000444e7d in sqlite3VdbeExec ()
#8  0x0000000000440163 in sqlite3Step ()
#9  0x000000000044034d in sqlite3_step ()
#10 0x0000000000405240 in shell_exec ()
#11 0x000000000040c4aa in process_input ()
#12 0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
  [hiq at hiqserver2 ~]$ pstack 96660
#0  0x00000034f020ec50 in __lseek_nocancel () from /lib64/libpthread.so.0
#1  0x000000000041a42c in seekAndRead ()
#2  0x000000000041a5f9 in unixRead ()
#3  0x0000000000410a86 in sqlite3OsRead ()
#4  0x0000000000422433 in readDbPage ()
#5  0x0000000000424e88 in sqlite3PagerAcquire ()
#6  0x000000000042c514 in btreeGetPage ()
#7  0x000000000042c60a in getAndInitPage ()
#8  0x000000000042fc25 in moveToChild ()
#9  0x0000000000430871 in sqlite3BtreeMovetoUnpacked ()
#10 0x000000000043d65f in sqlite3VdbeCursorMoveto ()
#11 0x0000000000444e7d in sqlite3VdbeExec ()
#12 0x0000000000440163 in sqlite3Step ()
#13 0x000000000044034d in sqlite3_step ()
#14 0x0000000000405240 in shell_exec ()
#15 0x000000000040c4aa in process_input ()
#16 0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
[hiq at hiqserver2 ~]$ pstack 96660
#0  0x00000034f020ec50 in __lseek_nocancel () from /lib64/libpthread.so.0
#1  0x000000000041a42c in seekAndRead ()
#2  0x000000000041a5f9 in unixRead ()
#3  0x0000000000410a86 in sqlite3OsRead ()
#4  0x0000000000422433 in readDbPage ()
#5  0x0000000000424e88 in sqlite3PagerAcquire ()
#6  0x000000000042c514 in btreeGetPage ()
#7  0x000000000042c60a in getAndInitPage ()
#8  0x000000000042fc25 in moveToChild ()
#9  0x0000000000430871 in sqlite3BtreeMovetoUnpacked ()
#10 0x000000000043d65f in sqlite3VdbeCursorMoveto ()
#11 0x0000000000444e7d in sqlite3VdbeExec ()
#12 0x0000000000440163 in sqlite3Step ()
#13 0x000000000044034d in sqlite3_step ()
#14 0x0000000000405240 in shell_exec ()
#15 0x000000000040c4aa in process_input ()
#16 0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
  [hiq at hiqserver2 ~]$ pstack 96660
#0  0x00000034f020e530 in __read_nocancel () from /lib64/libpthread.so.0
#1  0x000000000041a485 in seekAndRead ()
#2  0x000000000041a5f9 in unixRead ()
#3  0x0000000000410a86 in sqlite3OsRead ()
#4  0x0000000000422433 in readDbPage ()
#5  0x0000000000424e88 in sqlite3PagerAcquire ()
#6  0x000000000042c514 in btreeGetPage ()
#7  0x000000000042c60a in getAndInitPage ()
#8  0x000000000042fc25 in moveToChild ()
#9  0x0000000000430871 in sqlite3BtreeMovetoUnpacked ()
#10 0x000000000043d65f in sqlite3VdbeCursorMoveto ()
#11 0x0000000000444e7d in sqlite3VdbeExec ()
#12 0x0000000000440163 in sqlite3Step ()
#13 0x000000000044034d in sqlite3_step ()
#14 0x0000000000405240 in shell_exec ()
#15 0x000000000040c4aa in process_input ()
#16 0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
[hiq at hiqserver2 ~]$ pstack 96660
#0  0x000000000042c505 in btreeGetPage ()
#1  0x000000000042c60a in getAndInitPage ()
#2  0x000000000042fc25 in moveToChild ()
#3  0x0000000000430871 in sqlite3BtreeMovetoUnpacked ()
#4  0x000000000043d65f in sqlite3VdbeCursorMoveto ()
#5  0x0000000000444e7d in sqlite3VdbeExec ()
#6  0x0000000000440163 in sqlite3Step ()
#7  0x000000000044034d in sqlite3_step ()
#8  0x0000000000405240 in shell_exec ()
#9  0x000000000040c4aa in process_input ()
#10 0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
  [hiq at hiqserver2 ~]$ pstack 96660
#0  0x0000000000417a49 in sqlite3GetVarint ()
#1  0x000000000043045e in sqlite3BtreeMovetoUnpacked ()
#2  0x000000000043d65f in sqlite3VdbeCursorMoveto ()
#3  0x0000000000444e7d in sqlite3VdbeExec ()
#4  0x0000000000440163 in sqlite3Step ()
#5  0x000000000044034d in sqlite3_step ()
#6  0x0000000000405240 in shell_exec ()
#7  0x000000000040c4aa in process_input ()
#8  0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
[hiq at hiqserver2 ~]$ pstack 96660
#0  0x0000000000430473 in sqlite3BtreeMovetoUnpacked ()
#1  0x000000000043d65f in sqlite3VdbeCursorMoveto ()
#2  0x0000000000444e7d in sqlite3VdbeExec ()
#3  0x0000000000440163 in sqlite3Step ()
#4  0x000000000044034d in sqlite3_step ()
#5  0x0000000000405240 in shell_exec ()
#6  0x000000000040c4aa in process_input ()
#7  0x000000000040d573 in main ()
  [hiq at hiqserver2 ~]$
  [hiq at hiqserver2 ~]$ pstack 96660
#0  0x000000000041f5f6 in pcache1Fetch ()
#1  0x000000000041e1eb in sqlite3PcacheFetch ()
#2  0x0000000000424ce7 in sqlite3PagerAcquire ()
#3  0x000000000042c514 in btreeGetPage ()
#4  0x000000000042c60a in getAndInitPage ()
#5  0x000000000042fc25 in moveToChild ()
#6  0x0000000000430871 in sqlite3BtreeMovetoUnpacked ()
#7  0x000000000043d65f in sqlite3VdbeCursorMoveto ()
#8  0x0000000000444e7d in sqlite3VdbeExec ()
#9  0x0000000000440163 in sqlite3Step ()
#10 0x000000000044034d in sqlite3_step ()
#11 0x0000000000405240 in shell_exec ()
#12 0x000000000040c4aa in process_input ()
#13 0x000000000040d573 in main ()

etc.



On 05/27/2015 05:42 PM, gunnar wrote:
> OK, so I executed query:
>
> sqlite> SELECT * FROM ordercallback cb WHERE 
> sessionuuid='42bda93e-a3a3-4fca-a5c8-dc3dcf2a663e' AND endstate=0 AND 
> working=1 AND cb_seq_num=(SELECT max(cb_seq_num) FROM ordercallback 
> WHERE server_order_id=cb.server_order_id AND 
> sessionuuid=cb.sessionuuid AND working=1);
>
> And it got 'stuck' after displaying a couple hundred records or so. 
> And the stack trace is then:
>
> #0  0x00000034f020ec50 in __lseek_nocancel () from /lib64/libpthread.so.0
> #1  0x000000000041a42c in seekAndRead ()
> #2  0x000000000041a5f9 in unixRead ()
> #3  0x0000000000410a86 in sqlite3OsRead ()
> #4  0x0000000000422433 in readDbPage ()
> #5  0x0000000000424e88 in sqlite3PagerAcquire ()
> #6  0x000000000042c514 in btreeGetPage ()
> #7  0x000000000042c60a in getAndInitPage ()
> #8  0x000000000042fc25 in moveToChild ()
> #9  0x0000000000430871 in sqlite3BtreeMovetoUnpacked ()
> #10 0x000000000043d65f in sqlite3VdbeCursorMoveto ()
> #11 0x0000000000444e7d in sqlite3VdbeExec ()
> #12 0x0000000000440163 in sqlite3Step ()
> #13 0x000000000044034d in sqlite3_step ()
> #14 0x0000000000405406 in shell_exec ()
> #15 0x000000000040c4aa in process_input ()
> #16 0x000000000040d573 in main ()
>
>
>
>
>
>
> On 05/27/2015 05:34 PM, gunnar wrote:
>> Tomorrow I'll also try to make a stack trace of sqlite_shell with 
>> pstack at the moment its stuck.
>>
>>
>>
>> Gunnar Harms
>>
>>
>> HiQ Invest
>> Rembrandt Tower ? 9th floor
>> Amstelplein 1
>> 1096 HA Amsterdam
>>
>>
>> On 05/27/2015 05:23 PM, gunnar wrote:
>>> Hello Simon,
>>>
>>> I'm not 100% sure that I understand what you mean. I remember I 
>>> tested that the sub query is the one that gets 'stuck' but I can't 
>>> remember how I found out (I'm already trying to remember how I found 
>>> it out and let you know once I recall it)
>>>
>>> I also found a very bad workaround:
>>>
>>> The original query that gets stuck once the process that inserts 
>>> into ordercallback table also runs (but doesn't get stuck when the 
>>> other process doesn't run):
>>>
>>> -- get the count of open orders
>>> SELECT count(*) FROM ordercallback cb WHERE sessionuuid=(SELECT uuid 
>>> FROM session WHERE date=(SELECT max(date) FROM session)) AND 
>>> endstate=0 AND working=1 AND cb_seq_num=(SELECT max(cb_seq_num) FROM 
>>> ordercallback WHERE server_order_id=cb.server_order_id AND 
>>> sessionuuid=cb.sessionuuid AND working=1);
>>>
>>>
>>> The following always work:
>>>
>>> create table aap as select * from ordercallback where 
>>> sessionuuid=(select uuid from session where date=(select max(date) 
>>> from session)) and endstate=0 and working=1;
>>> create table boe as select server_order_id, max(cb_seq_num) as 
>>> max_cb_seq_num from ordercallback where sessionuuid=(select uuid 
>>> from session where date=(select max(date) from session)) and 
>>> working=1 group by server_order_id;
>>>
>>> -- get the count of open orders
>>>
>>> select count(*) from aap, boe where 
>>> aap.server_order_id=boe.server_order_id and 
>>> aap.cb_seq_num=boe.max_cb_seq_num;
>>>
>>>
>>> But of course that is not what I want. I want it isolated, one 
>>> query, as it should be.
>>>
>>>
>>>
>>>
>>> We have several servers and each has its own sqlite database with an 
>>> ordercallback table in which they insert records. With only one of 
>>> them we have this issue, but that is also the one that inserts much 
>>> and much more records into its ordercallback table than the others.
>>>
>>>
>>>
>>> Regards,
>>> Gunnar
>>>
>>>
>>>
>>> On 05/27/2015 04:48 PM, Simon Slavin wrote:
>>>> On 27 May 2015, at 11:00am, gunnar <gharms at hiqinvest.nl> wrote:
>>>>
>>>>> SELECT *
>>>>> FROM ordercallback cb
>>>>> WHERE sessionuuid=@SESSIONUUID
>>>>> AND endstate=0
>>>>> AND working=1
>>>>> AND cb_seq_num=(
>>>>> SELECT max(cb_seq_num)
>>>>> FROM ordercallback
>>>>> WHERE server_order_id=cb.server_order_id
>>>>> AND sessionuuid=cb.sessionuuid AND working=1);
>>>> As Dr Hipp has pointed out, SQLite locks the entire database during 
>>>> the whole of your SELECT *, so any other process should not be 
>>>> interrupting your query once it has got started.
>>>>
>>>> Purely for research and not as a permanent solution, have you tried 
>>>> splitting your query up into two ? First do the subSELECT, 
>>>> returning what I think should be a single value, then do the main 
>>>> SELECT. You might be able to figure out which one is being 
>>>> interrupted.
>>>>
>>>> Simon.
>>>> _______________________________________________
>>>> sqlite-users mailing list
>>>> sqlite-users at mailinglists.sqlite.org
>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>>
>>> _______________________________________________
>>> sqlite-users mailing list
>>> sqlite-users at mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>

Reply via email to