[sqlite] System.Data.SQlite Version 1.0.65 (3.7.7.1) much slower than 1.0.74 (3.6.16) !??
Hello I have been using System.Data.SQlite for over a year. Recently I downloaded the latest version, hoping to obtain better performance. Instead I was astonished to find that processing with the new version takes about TWICE THE TIME as with the old version. This is what my application does: 1. Creation phase: convert some input data into a database file containing several tables, using a succession of INSERT commands, all surrounded by a single BeginTransaction() / EndTransaction() pair. 2. Analysis phase: open an existing database file, and perform various SELECT operations on it. In both modes, 1.0.74 is significantly slower than 1.0.65. Setting the WAL option brings the Creation phase performance back to that of the old version. Trying further optimization options (PRAGMA synchronous=OFF; PRAGMA count_changes=OFF; PRAGMA temp_store=MEMORY as described in http://blog.quibb.org/2010/08/fast-bulk-inserts-into-sqlite) didn't help. Anybody have an idea ? Thanks ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] LEFT JOIN optimization
On 09/23/2011 04:01 AM, Mira Suk wrote: EXPLAIN QUERY PLAN SELECT DISTINCT ItemsME.Points FROM (ItemsME_Properties LEFT JOIN ItemsME ON ItemsME_Properties.IDR = ItemsME.IDR) WHERE ItemsME.IDR IN (SELECT IDR FROM cProds WHERE Prod = 106) selectidorderfromdetail 000SCAN TABLE ItemsME_Properties (~100 rows) 011SEARCH TABLE ItemsME USING INDEX IDR (IDR=?) (~2 rows) 000EXECUTE LIST SUBQUERY 1 100SEARCH TABLE cProds USING INDEX Prod (Prod=?) (~210 rows) 000EXECUTE LIST SUBQUERY 2 200SEARCH TABLE cProds USING INDEX Prod (Prod=?) (~210 rows) 000USE TEMP B-TREE FOR DISTINCT this takes about 3 seconds EXPLAIN QUERY PLAN SELECT DISTINCT ItemsME.Points FROM (ItemsME_Properties INNER JOIN ItemsME ON ItemsME_Properties.IDR = ItemsME.IDR) WHERE ItemsME.IDR IN (SELECT IDR FROM cProds WHERE Prod = 106) 001SEARCH TABLE ItemsME USING INDEX IDR (IDR=?) (~25 rows) 000EXECUTE LIST SUBQUERY 1 100SEARCH TABLE cProds USING INDEX Prod (Prod=?) (~210 rows) 010SEARCH TABLE ItemsME_Properties USING COVERING INDEX IDR (IDR=?) (~1 rows) 000USE TEMP B-TREE FOR DISTINCT this runs in about 15 milisec = results are equal from both cases as ItemsME_Properties is temporary table (updated with triggers as reads are much more frequent) to workaround performance issues with search... (it should be 1:1 with ItemsME, just with less data, and some data reformatted) maybe I'm just stupid but I don't really see reason for scan in first case. Basically it is because with an inner join SQLite can reorder the tables in the join - A INNER JOIN B is the same as B INNER JOIN A. But it can't do that with left joins - as A LEFT JOIN B ON expr is not always the same as B LEFT JOIN A ON expr. And in this case SQLite needs to reorder tables ItemsME and ItemsME_Properties in order to avoid the table scan. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Newbie starting off question
Thanks guys It is an embedded system so I have full control of everything. Due to memory size constraints the ideal thing for me would be to use a shared library. According to the first post, sharing a library between cgi modules and an application should not be a problem? David ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Newbie starting off question
On 23 Sep 2011, at 10:17am, david wrote: It is an embedded system so I have full control of everything. Due to memory size constraints the ideal thing for me would be to use a shared library. According to the first post, sharing a library between cgi modules and an application should not be a problem? Not a problem. If your platform allows it, and your compiler can make it happen, do it. Problems related to that stuff come down to multi-process and multi-thread oversights, not using the same library from many apps. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Newbie starting off question
On Fri, Sep 23, 2011 at 12:02 PM, Simon Slavin slav...@bigfraud.org wrote: Not a problem. If your platform allows it, and your compiler can make it happen, do it. Problems related to that stuff come down to multi-process and multi-thread oversights, not using the same library from many apps. A tiny expansion to that: someone recently reported a problem when serving a singleton db instance from a DLL (he was getting 2 different instances of the singleton, IIRC), but that's not an sqlite3-specific problem. -- - stephan beal http://wanderinghorse.net/home/stephan/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] c-api document suggestion
Dne 23.9.2011 4:41, Igor Tandetnik napsal(a): Mira Sukmira@centrum.cz wrote: On 9/21/2011 21:22 Igor Tandetnik wrote: You can include the NUL terminator, if you want it to actually be stored in the database. Actually you can't - if you do all SQL string functions will not work. to be clear - SELECT TRIM(what ever text column you stored with including null on end of string) will not trim that string. It does work - it removes all whitespace from the end of the string, up to but not including the first non-whitespace character. Which happens to be NUL. It follows the spec perfectly, it's your expectations that are wrong. Note that I didn't say it was wise to store NUL characters as part of the string - I only said that you could do it if you wanted to. sqlite3_bind_text takes the length parameter at face value, and stores exactly as many bytes as you tell it to store. It's up to you to ensure that the values actually make sense for your application. Garbage in/garbage out and all that. I just find entire nul handling in SQLite strange. it's C API why not expect C (that is nul terminated) strings. man says Strings returned by sqlite3_column_text() and sqlite3_column_text16(), even empty strings, are always zero terminated. The values returned by sqlite3_column_bytes() http://www.sqlite.org/c3ref/column_blob.html and sqlite3_column_bytes16() http://www.sqlite.org/c3ref/column_blob.html do not include the zero terminators at the end of the string. For clarity: the values returned by sqlite3_column_bytes() http://www.sqlite.org/c3ref/column_blob.html and sqlite3_column_bytes16() http://www.sqlite.org/c3ref/column_blob.html are the number of bytes in the string, not the number of characters. If I include my nul does this mean string is my nul terminated or sqlite appends another one ? is this auto-added null included in size returned ? or not ? are my nuls removed from string size or not ? (there is man obviously wrong as column_bytes returns size including zero terminators at the end of the string if I set them) = testing how API behaves is required to produce code due to unclear docs IMHO. As for non-functioning SQL text functions - David Garfield did some test. It's some time ago when I found out this does not work as expected, and yes, I most likely provided invalid sample. Fact is I wont ever store nul in SQLite database, as I don't trust all SQLite string functions will respect it. Maybe as You said, my expectations are wrong, however my expectations are based on other database engines. if one behaves differently, maybe it's not just my expectations. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] c-api document suggestion
On 23 Sep 2011, at 11:18am, Mirek Suk wrote: I just find entire nul handling in SQLite strange. it's C API why not expect C (that is nul terminated) strings. That's more or less the problem: C expects 0x00 termination. But SQLite3 is written to support UTF-8 and UTF-16 strings of specified lengths. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] NUL handling bugs (was Re: c-api document suggestion)
On Thu, Sep 22, 2011 at 7:53 PM, David Garfield garfi...@irving.iisd.sra.com wrote: As far as I am concerned, this is a SERIOUS bug in sqlite. SQLite does whatever you ask it to do. It makes no attempt to enforce good string hygiene. If you hand it well-formed strings, it gives the expected results and always returning well-formed strings. If, on the other hand, you give it malformed strings with embedded NULs and/or illegal UTF characters, SQLite will muddle through as best it can using whatever you gave it, without complaining, and without overflowing buffers or segfaulting or leaking memory or otherwise failing. This is an intentional feature, not a bug. When you have counted strings, which is all sqlite has (except for some API functions), NULs have no special meaning. This is fairly easily seen in the command line interface, where the code ignores the length and uses strlen forms. Testing various functions: hex(), relational operators, and group by get it right. max() and min() seem to get it right (but I may not have tested enough). length(), quote(), upper(), lower(), like(), and glob() get it wrong. group_concat() gets it wrong returning the result, but looks like it may have built the right thing internally. replace(), trim(), ltrim(), rtrim() get it right in the first parameter, but not in the second parameter. replace() gets it right in the third parameter. I'm not sure what else would be expected to allow NULs, so I didn't try numeric and date/time functions. Testing can be a bit of a pain. Things like: BEGIN TRANSACTION; CREATE TABLE t(a); insert into t values (cast(x'6F6E65202020' as text)); insert into t values (cast(x'6F6E6520202030' as text)); insert into t values (cast(x'6F6E6520202000' as text)); insert into t values (cast(x'6F6E6520202078' as text)); insert into t values (cast(x'6F6E650074776F20' as text)); insert into t values (cast(x'20006F6E6520' as text)); insert into t values (cast(x'00206F6E6520' as text)); COMMIT; .mode column .width 10 20 20 20 .header on select a, hex(a), hex(trim(a)), hex(trim(a,'one'||x'2000')) from t; Yields: a hex(a)hex(trim(a)) hex(trim(a,'one'||x' -- one 6F6E65202020 6F6E65 one 0 6F6E65202020306F6E652020203030 one 6F6E65202020006F6E652020200000 one x 6F6E65202020786F6E652020207878 one 6F6E650074776F20 6F6E650074776F007477 20006F6E6520 006F6E65 00 00206F6E6520 00206F6E6500 What I expect (including column screw-ups because NUL is zero width in xterm) is: a hex(a)hex(trim(a)) hex(trim(a,'one'||x' -- one 6F6E65202020 6F6E65 one 0 6F6E65202020306F6E652020203030 one6F6E65202020006F6E6520202000 one x 6F6E65202020786F6E652020207878 onetwo 6F6E650074776F20 6F6E650074776F7477 two 20006F6E6520 006F6E65 two 00206F6E6520 00206F6E65 Without the hex() calls, you can't even tell what worked and what didn't. --David Garfield Mira Suk writes: On 9/21/2011 21:22 Igor Tandetnik wrote: You can include the NUL terminator, if you want it to actually be stored in the database. Igor Tandetnik Actually you can't - if you do all SQL string functions will not work. to be clear - SELECT TRIM(what ever text column you stored with including null on end of string) will not trim that string. found out the hard way. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] undefined symbol: sqlite3_version.
I have copied the C programe in http://sqlite.org/quickstart.html and try to compile it. With BorlandC and TCC I obtain the unique same error: undefined symbol: sqlite3_version. SQLITE_API const char sqlite3_version[] = SQLITE_VERSION; is in line 693 and 110173 (ifndef SQLITE_AMALGAMATION which is defined in line 21). Suspecting it might be something related to the last version I looked in vain into the last two months list archives? Would someone now the answer to this? Thank you! jfc ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] DISTINCT on a JOIN
I have a setup which I will simplify as follows: There is a table of courses. Every course can have any number of people working on it. Every course involves any number of tasks. I want to make a SELECT which will return a table as follows: course1 number-of-people-involved-in-course1 number-of-tasks-involved-in-course1 course2 number-of-people-involved-in-course2 number-of-tasks-involved-in-course2 course3 number-of-people-involved-in-course3 number-of-tasks-involved-in-course3 course4 number-of-people-involved-in-course4 number-of-tasks-involved-in-course4 ... I can do one or the other but not both. Here's some text commands for you to play around with. CREATE TABLE courses (id INTEGER PRIMARY KEY,name TEXT); CREATE TABLE people (id INTEGER PRIMARY KEY,course INTEGER,name TEXT); CREATE TABLE tasks (id INTEGER PRIMARY KEY,course INTEGER,name TEXT); INSERT INTO courses VALUES (14,'starter course'); INSERT INTO people VALUES (21,14,'erica'); INSERT INTO people VALUES (22,14,'rod'); INSERT INTO tasks VALUES (31,14,'teaching'); INSERT INTO tasks VALUES (32,14,'marking'); sqlite SELECT courses.name,group_concat(people.name) FROM courses LEFT JOIN people ON people.course=courses.id GROUP BY courses.id; starter course|erica,rod sqlite SELECT courses.name,group_concat(tasks.name) FROM courses LEFT JOIN tasks ON tasks.course=courses.id GROUP BY courses.id; starter course|marking,teaching BUT ... sqlite SELECT courses.name,group_concat(people.name),group_concat(tasks.name) FROM courses LEFT JOIN people ON people.course=courses.id LEFT JOIN tasks ON tasks.course=courses.id GROUP BY courses.id; starter course|erica,erica,rod,rod|marking,teaching,marking,teaching sqlite SELECT courses.name,group_concat(people.name),group_concat(tasks.name) FROM courses LEFT JOIN people ON people.course=courses.id LEFT JOIN tasks ON tasks.course=courses.id GROUP BY courses.id,people.id,tasks.id; starter course|erica|teaching starter course|erica|marking starter course|rod|teaching starter course|rod|marking probably because ... sqlite SELECT * FROM courses LEFT JOIN people ON people.course=courses.id LEFT JOIN tasks ON tasks.course=courses.id; 14|starter course|21|14|erica|32|14|marking 14|starter course|21|14|erica|31|14|teaching 14|starter course|22|14|rod|32|14|marking 14|starter course|22|14|rod|31|14|teaching sqlite SELECT * FROM courses LEFT JOIN people ON people.course=courses.id LEFT JOIN tasks ON tasks.course=courses.id GROUP BY courses.id; 14|starter course|22|14|rod|31|14|teaching Is there some way of using DISTINCT or some other tactic which will make it list each related entry just the once ? Very grateful for any help. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] DISTINCT on a JOIN
Simon Slavin slav...@bigfraud.org wrote: I have a setup which I will simplify as follows: There is a table of courses. Every course can have any number of people working on it. Every course involves any number of tasks. I want to make a SELECT which will return a table as follows: course1 number-of-people-involved-in-course1 number-of-tasks-involved-in-course1 course2 number-of-people-involved-in-course2 number-of-tasks-involved-in-course2 course3 number-of-people-involved-in-course3 number-of-tasks-involved-in-course3 course4 number-of-people-involved-in-course4 number-of-tasks-involved-in-course4 select name, (select count(*) from people where course=courses.id), (select count(*) from tasks where course=courses.id) from courses; -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Assertion in Sqlite 3.7.x on Cygwin using WAL
I'll get to my Windows PC at Sunday and I'll do it Thanks Artyom Beilis -- CppCMS - C++ Web Framework: http://cppcms.sf.net/ CppDB - C++ SQL Connectivity: http://cppcms.sf.net/sql/cppdb/ From: Richard Hipp d...@sqlite.org To: Artyom Beilis artyom...@yahoo.com; General Discussion of SQLite Database sqlite-users@sqlite.org Sent: Thursday, September 22, 2011 3:23 PM Subject: Re: [sqlite] Assertion in Sqlite 3.7.x on Cygwin using WAL 2011/9/21 Artyom Beilis artyom...@yahoo.com While building with -DSQLITE_DEBUG it reports an assertion: assertion isExclusive==0 || isCreate failed: file sqlite3.c, line 34058, function: winOpen (Version 3.7.8) Can you send us a stack trace at the point of the assert()? This does not happen on Windows builds of the same sources. I've attached the script and VDBE traces under cygwin and windows builds. Cygwin version: 1.7.9(0.237/5/3) Artyom Beilis -- CppCMS - C++ Web Framework: http://cppcms.sf.net/ CppDB - C++ SQL Connectivity: http://cppcms.sf.net/sql/cppdb/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] c-api document suggestion
Mirek Suk mira@centrum.cz wrote: I just find entire nul handling in SQLite strange. it's C API why not expect C (that is nul terminated) strings. Because some people do want to store strings with embedded NULs, for various reasons. If you don't, just pass -1 for length and be done with it. man says Strings returned by sqlite3_column_text() and sqlite3_column_text16(), even empty strings, are always zero terminated. The values returned by sqlite3_column_bytes() http://www.sqlite.org/c3ref/column_blob.html and sqlite3_column_bytes16() http://www.sqlite.org/c3ref/column_blob.html do not include the zero terminators at the end of the string. For clarity: the values returned by sqlite3_column_bytes() http://www.sqlite.org/c3ref/column_blob.html and sqlite3_column_bytes16() http://www.sqlite.org/c3ref/column_blob.html are the number of bytes in the string, not the number of characters. If I include my nul does this mean string is my nul terminated or sqlite appends another one ? SQLite appends another one. is this auto-added null included in size returned No. But your own (if any) is. are my nuls removed from string size or not ? No. You get back exactly the sequence of bytes you put in. -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] DISTINCT on a JOIN
On 23 Sep 2011, at 12:57pm, Igor Tandetnik wrote: select name, (select count(*) from people where course=courses.id), (select count(*) from tasks where course=courses.id) from courses; Thank you thank you thank you Igor. I must learn how to use sub-selects properly. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] c-api document suggestion
On 23 Sep 2011 at 11:18, Mirek Suk mira@centrum.cz wrote: Dne 23.9.2011 4:41, Igor Tandetnik napsal(a): Note that I didn't say it was wise to store NUL characters as part of the string - I only said that you could do it if you wanted to. sqlite3_bind_text takes the length parameter at face value, and stores exactly as many bytes as you tell it to store. It's up to you to ensure that the values actually make sense for your application. Garbage in/garbage out and all that. I just find entire nul handling in SQLite strange. it's C API why not expect C (that is nul terminated) strings. Not in my case it's not. In my case it's the PHP API and I would be very annoyed if strings came back with unexpected NULs at the end, or if any NUL I insert in the middle of a string acted as a terminator. -- Cheers -- Tim ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] DISTINCT on a JOIN
course1 number-of-people-involved-in-course1 number-of-tasks-involved-in-course1 course2 number-of-people-involved-in-course2 number-of-tasks-involved-in-course2 course3 number-of-people-involved-in-course3 number-of-tasks-involved-in-course3 course4 number-of-people-involved-in-course4 number-of-tasks-involved-in-course4 This can be produced by: select crs.id , crs.name , (select count(id) from people pe where pe.course = crs.id) as num_people , (select count(id) from tasks ta where ta.course = crs.id)as num_tasks from courses crs ; Result: 14 starter course 2 2 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] select where date 30 days ago
Miklos, have you tried something like SELECT * FROM ticket WHERE time BETWEEN '2011-08-01' AND '2011-09-01'; ? I have my time as current_timestamp and gives me data like 2011-09-23 17:44:48, so using the above method works just fine with me. From: Miklos Koren miklos.ko...@gmail.com To: sqlite-users@sqlite.org Sent: Friday, September 16, 2011 3:01 PM Subject: [sqlite] select where date 30 days ago Hi, This is probably a basic SQL question, but I could not make it work under sqlite. How do I select records for which a datetime column is less than today - 30 days? I tried SELECT * FROM ticket WHERE time DATETIME('now','-30 days'); but it does not give the intended results. Thanks, Miklos ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] MC/DC coverage explained wrong in the home page?
Opinions vary on the exact meaning of MC/DC for a language (such as C) that has short-circuit boolean operators. You are advocating a more rigorous view of MC/DC that what I have heard before. This is not to say it is wrong, only different. For a decision of the form: if( A B )... The test suite for SQLite tries at least three cases: (1) A false (2) A true and B false (3) A true and B true You seem to be saying that this is only MC/DC if we split case (1) into two subcases: (1a) A false and B false (1b) A false and B true There are problems with this view, though. In many instances, B is undefined when A is false. In other words, if A is false, any attempt to calculate B will give undefined results - possibly a segfault. SQLite really does use the fact that is a short-circuit operator in C and so when A is false, it is technically illegal to make any conjectures about the value of B. Your argument is that there might be redundancy; that if B is always false when A is false, then it would be possible to simplify the decision to just: if( B )... True enough. But there are countless ways to refactor a boolean expression. I don't think that MC/DC has any requirement that the number of conditions be minimized, does it? Your objections would be understandable if SQLite where written in Pascal or Ada where AND and OR operators are not short-circuit and where the compiler is free to reorder them if it sees fit. But in C/C++ where the and || operators are short-circuit, and where the tests must occur in a well-defined order, things are different. It is as if the and || operators really marked boundaries between decisions, not conditions. But the | and operators used inside a decision are *not* short-circuit, and in those cases, your objections are valid. But those cases are rare, and we strive to demonstration independence of conditions there using testcase() macros. One of the key benefits we derive from testing SQLite to 100% branch coverage is that we end up testing the object code, not the source code. So even if the compiler makes a mistake, we will still likely catch it. I don't think your more rigorous definition of MC/DC is necessary to achieve that goal, is it? On Wed, Sep 21, 2011 at 4:36 PM, Sami Liedes slie...@cc.hut.fi wrote: Hi! Looking at http://www.sqlite.org/testing.html it seems to me that MC/DC coverage is explained wrong there. This in turn makes me wonder if SQLite tests really have 100% MC/DC coverage or if this claim is just based on mistaken understanding of MC/DC. The page explains: Wikipedia defines MC/DC as follows: * Each decision tries every possible outcome. * Each condition in a decision takes on every possible outcome. * Each entry and exit point is invoked. * Each condition in a decision is shown to independently affect the outcome of the decision. In the C programming language where and || are short-circuit operators, MC/DC and branch coverage are very nearly the same thing. The primary difference is in boolean vector tests. One can test for any of several bits in bit-vector and still obtain 100% branch test coverage even though the second element of MC/DC - the requirement that each condition in a decision take on every possible outcome - might not be satisfied. SQLite uses testcase() macros as described in the previous subsection to make sure that every condition in a bit-vector decision takes on every possible outcome. In this way, SQLite also achieves 100% MC/DC in addition to 100% branch coverage. I don't think the operative, clever thing in MC/DC is the second requirement, but the fourth one, and in it the important word is independently. Contrary to what seems to be claimed in the last paragraph, mak[ing] sure that every condition in a bit-vector decision takes on every possible outcome (the second requirement) expressly is *not* sufficient for MC/DC. In addition to that, you need to show that each condition in a decision [...] independently affect[s] the outcome of the decision (the fourth requirement). That is, if you have arbitrary boolean conditions A, B and C (not necessarily independent of each other), and you have a branch like if (A op B op C) the second condition means that you have to give test cases for the positive and negative of each A, B and C, and this is what seems to be explained by the page. That is, if you could provide only two test cases, one with A B C and the other with !A !B !C, that would satisfy the second condition. But the fourth condition requires something more: That each decision is shown to *independently* affect the outcome of the decision. That is, for each condition of A, B and C, you have to give two test cases where the difference in the test case both flips that *and only that* condition
Re: [sqlite] NUL handling bugs (was Re: c-api document suggestion)
But that is the point. Strings are generally defined in two ways. Either: 1) a pointer, and count every byte up to but not including a NUL. 2) a pointer and a length, and count every byte in the specified length. If you have a specified length, the length matters, and NULs do not. NUL is a perfectly valid character in such a string. Internally, SQLite uses counted strings. It should treat NUL as just one more character, since it is just one more character. If a USER decides to use a negative length in sqlite3_bind_text() and to use sqlite3_column_text() without using sqlite3_column_bytes(), then that user's database doesn't have NULs in the strings, and everything just works for him. For the database as a whole to ASSUME this breaks SQLite for users who do want NULs in their text data. Actually, I guess I am more concerned with the shell. I use blob columns that are partly text and would like to have the text portions at least be visible. I suspect that at least part of this problem came from SQLite's history as a TCL add-on. I suspect TCL, at least in it's early days, didn't allow NUL in a string. --David Garfield Richard Hipp writes: On Thu, Sep 22, 2011 at 7:53 PM, David Garfield garfi...@irving.iisd.sra.com wrote: As far as I am concerned, this is a SERIOUS bug in sqlite. SQLite does whatever you ask it to do. It makes no attempt to enforce good string hygiene. If you hand it well-formed strings, it gives the expected results and always returning well-formed strings. If, on the other hand, you give it malformed strings with embedded NULs and/or illegal UTF characters, SQLite will muddle through as best it can using whatever you gave it, without complaining, and without overflowing buffers or segfaulting or leaking memory or otherwise failing. This is an intentional feature, not a bug. When you have counted strings, which is all sqlite has (except for some API functions), NULs have no special meaning. This is fairly easily seen in the command line interface, where the code ignores the length and uses strlen forms. Testing various functions: hex(), relational operators, and group by get it right. max() and min() seem to get it right (but I may not have tested enough). length(), quote(), upper(), lower(), like(), and glob() get it wrong. group_concat() gets it wrong returning the result, but looks like it may have built the right thing internally. replace(), trim(), ltrim(), rtrim() get it right in the first parameter, but not in the second parameter. replace() gets it right in the third parameter. I'm not sure what else would be expected to allow NULs, so I didn't try numeric and date/time functions. Testing can be a bit of a pain. Things like: BEGIN TRANSACTION; CREATE TABLE t(a); insert into t values (cast(x'6F6E65202020' as text)); insert into t values (cast(x'6F6E6520202030' as text)); insert into t values (cast(x'6F6E6520202000' as text)); insert into t values (cast(x'6F6E6520202078' as text)); insert into t values (cast(x'6F6E650074776F20' as text)); insert into t values (cast(x'20006F6E6520' as text)); insert into t values (cast(x'00206F6E6520' as text)); COMMIT; .mode column .width 10 20 20 20 .header on select a, hex(a), hex(trim(a)), hex(trim(a,'one'||x'2000')) from t; Yields: a hex(a)hex(trim(a)) hex(trim(a,'one'||x' -- one 6F6E65202020 6F6E65 one 0 6F6E65202020306F6E652020203030 one 6F6E65202020006F6E652020200000 one x 6F6E65202020786F6E652020207878 one 6F6E650074776F20 6F6E650074776F007477 20006F6E6520 006F6E65 00 00206F6E6520 00206F6E6500 What I expect (including column screw-ups because NUL is zero width in xterm) is: a hex(a)hex(trim(a)) hex(trim(a,'one'||x' -- one 6F6E65202020 6F6E65 one 0 6F6E65202020306F6E652020203030 one6F6E65202020006F6E6520202000 one x 6F6E65202020786F6E652020207878 onetwo 6F6E650074776F20 6F6E650074776F7477 two 20006F6E6520 006F6E65 two 00206F6E6520 00206F6E65 Without the hex() calls, you can't even tell what worked and what didn't. --David Garfield Mira Suk writes: On 9/21/2011 21:22 Igor Tandetnik wrote: You can include the NUL terminator, if you want it to actually be stored in the database. Igor Tandetnik Actually you can't - if you do all SQL string functions will not work. to be clear - SELECT TRIM(what ever text column you stored with
Re: [sqlite] MC/DC coverage explained wrong in the home page?
On Fri, Sep 23, 2011 at 11:16:37AM -0400, Richard Hipp wrote: Opinions vary on the exact meaning of MC/DC for a language (such as C) that has short-circuit boolean operators. You are advocating a more rigorous view of MC/DC that what I have heard before. This is not to say it is wrong, only different. I think you are describing (plain) Condition/Decision Coverage, not Modified Condition/Decision Coverage (see below). For a decision of the form: if( A B )... The test suite for SQLite tries at least three cases: (1) A false (2) A true and B false (3) A true and B true You seem to be saying that this is only MC/DC if we split case (1) into two subcases: (1a) A false and B false (1b) A false and B true Hmm, no, not quite, although you do have a point here that in shortcutting languages it's different. I would concede that in a shortcutting language you do not need to test for (1a) and (1b). I think it means that for the decision of the form if( A B ), you additionally need to show that A and B both can independently affect the outcome; that is, For A: Either (2) or (3) results in different branch being taken from (1) (which in this case would be true for (3), but the analysis is not always so straightforward) For B: The branches taken in (2) and (3) are different (which is the case here). So in case of if( A B ), I think you are correct that these three test cases suffice for MC/DC. But you *have* to consider the outcome too for it to be MC/DC. Taking a look at the Wikipedia page[1] for MC/DC, I believe what you describe is plain Condition/Decision Coverage, not MC/DC: * Condition/Decision Coverage Every point of entry and exit in the program has been invoked at least once, every condition in a decision in the program has taken all possible outcomes at least once, and every decision in the program has taken all possible outcomes at least once. * Modified Condition/Decision Coverage Every point of entry and exit in the program has been invoked at least once, every condition in a decision in the program has taken on all possible outcomes at least once, and each condition has been shown to affect that decision outcome independently. A condition is shown to affect a decision’s outcome independently by varying just that condition while holding fixed all other possible conditions. The condition/decision criterion does not guarantee the coverage of all conditions in the module because in many test cases, some conditions of a decision are masked by the other conditions. Using the modified condition/decision criterion, each condition must be shown to be able to act on the decision outcome by itself, everything else being held fixed. The MC/DC criterion is thus much stronger than the condition/decision coverage. Your argument is that there might be redundancy; that if B is always false when A is false, then it would be possible to simplify the decision to just: if( B )... True enough. But there are countless ways to refactor a boolean expression. I don't think that MC/DC has any requirement that the number of conditions be minimized, does it? I think that's one of the central ideas of MC/DC. Well, I think it doesn't require *minimizing* per se, only that each condition independently is enough to flip the outcome. (I now think I was wrong to talk about subexpressions in my original post; The conditions are the atomic conditions, not subexpressions.) I certainly cannot imagine of another plausible reading of the fourth condition. Perhaps you have a different plausible interpretation? In fact nearly all definitions of MC/DC I can find contain this paragraph, which explains it quite well: A condition is shown to affect a decision's outcome independently by varying just that decision while holding fixed all other possible conditions. One of the key benefits we derive from testing SQLite to 100% branch coverage is that we end up testing the object code, not the source code. So even if the compiler makes a mistake, we will still likely catch it. I don't think your more rigorous definition of MC/DC is necessary to achieve that goal, is it? I certainly think that even 100% branch coverage (also called Decision Coverage) is quite impressive in itself without MC/DC, and if what the SQLite test harness does is plain Condition/Decision Coverage, I'm sure that already places SQLite within the best tested 1% of software. I'm happy and impressed with how tested it is, I just don't think it's MC/DC :) As far as I can tell 100% MC/DC coverage is only required in highly safety critical systems, most notably the DO-178B standard for avionics for the most
Re: [sqlite] MC/DC coverage explained wrong in the home page?
On Fri, Sep 23, 2011 at 1:51 PM, Sami Liedes slie...@cc.hut.fi wrote: For a decision of the form: if( A B )... The test suite for SQLite tries at least three cases: (1) A false (2) A true and B false (3) A true and B true You seem to be saying that this is only MC/DC if we split case (1) into two subcases: (1a) A false and B false (1b) A false and B true Hmm, no, not quite, although you do have a point here that in shortcutting languages it's different. I would concede that in a shortcutting language you do not need to test for (1a) and (1b). I think it means that for the decision of the form if( A B ), you additionally need to show that A and B both can independently affect the outcome; For case (1), since B is uncomputable, we can deem it to be true. Then since case (3) has a different outcome from cases (1) and (2), we do show that each term is independent of the other. The fact that B is deemed true in case (1) might raise eyebrows, but in a shortcutting language, that's the best you can do, I think. that is, For A: Either (2) or (3) results in different branch being taken from (1) (which in this case would be true for (3), but the analysis is not always so straightforward) For B: The branches taken in (2) and (3) are different (which is the case here). So in case of if( A B ), I think you are correct that these three test cases suffice for MC/DC. But you *have* to consider the outcome too for it to be MC/DC. Taking a look at the Wikipedia page[1] for MC/DC, I believe what you describe is plain Condition/Decision Coverage, not MC/DC: * Condition/Decision Coverage Every point of entry and exit in the program has been invoked at least once, every condition in a decision in the program has taken all possible outcomes at least once, and every decision in the program has taken all possible outcomes at least once. * Modified Condition/Decision Coverage Every point of entry and exit in the program has been invoked at least once, every condition in a decision in the program has taken on all possible outcomes at least once, and each condition has been shown to affect that decision outcome independently. A condition is shown to affect a decision’s outcome independently by varying just that condition while holding fixed all other possible conditions. The condition/decision criterion does not guarantee the coverage of all conditions in the module because in many test cases, some conditions of a decision are masked by the other conditions. Using the modified condition/decision criterion, each condition must be shown to be able to act on the decision outcome by itself, everything else being held fixed. The MC/DC criterion is thus much stronger than the condition/decision coverage. Your argument is that there might be redundancy; that if B is always false when A is false, then it would be possible to simplify the decision to just: if( B )... True enough. But there are countless ways to refactor a boolean expression. I don't think that MC/DC has any requirement that the number of conditions be minimized, does it? I think that's one of the central ideas of MC/DC. Well, I think it doesn't require *minimizing* per se, only that each condition independently is enough to flip the outcome. (I now think I was wrong to talk about subexpressions in my original post; The conditions are the atomic conditions, not subexpressions.) I certainly cannot imagine of another plausible reading of the fourth condition. Perhaps you have a different plausible interpretation? In fact nearly all definitions of MC/DC I can find contain this paragraph, which explains it quite well: A condition is shown to affect a decision's outcome independently by varying just that decision while holding fixed all other possible conditions. One of the key benefits we derive from testing SQLite to 100% branch coverage is that we end up testing the object code, not the source code. So even if the compiler makes a mistake, we will still likely catch it. I don't think your more rigorous definition of MC/DC is necessary to achieve that goal, is it? I certainly think that even 100% branch coverage (also called Decision Coverage) is quite impressive in itself without MC/DC, and if what the SQLite test harness does is plain Condition/Decision Coverage, I'm sure that already places SQLite within the best tested 1% of software. I'm happy and impressed with how tested it is, I just don't think it's MC/DC :) As far as I can tell 100% MC/DC coverage is only required in highly safety critical systems, most notably the DO-178B
Re: [sqlite] MC/DC coverage explained wrong in the home page?
[Note: In case my explanations are not clear enough, there's a fairly formal position paper by Certification Authorities Software Team (CAST-10) clarifying MC/DC here: http://www.faa.gov/aircraft/air_cert/design_approvals/air_software/cast/cast_papers/media/cast-10.pdf ] On Fri, Sep 23, 2011 at 02:05:40PM -0400, Richard Hipp wrote: For case (1), since B is uncomputable, we can deem it to be true. Then since case (3) has a different outcome from cases (1) and (2), we do show that each term is independent of the other. The fact that B is deemed true in case (1) might raise eyebrows, but in a shortcutting language, that's the best you can do, I think. Yes, that reasoning makes sense. But even allowing for that doesn't in all cases satisfy the fourth MC/DC criterion. It does for the simple (A B) case, but consider a more complex expression, ((A B) || (C D)): Now the truth table (with _ as possibly uncomputable) would be A B C D branch taken (1) 0 _ 0 _ F (2) 0 _ 1 0 F (3) 0 _ 1 1 T (4) 1 0 0 _ F (5) 1 0 1 0 F (6) 1 0 1 1 T (7) 1 1 _ _ T So using your approach, that is the plain Condition/Decision Coverage, I believe these test cases would suffice: A B C D branch taken (1) 0 _ 0 _ F (2) 0 _ 1 0 F (6) 1 0 1 1 T (7) 1 1 _ _ T This satisfies all three plain C/DC criteria: (a) Every point of entry and exit in the program has been invoked at least once -- does not apply to this if statement (b) Every condition in a decision in the program has taken all possible outcomes at least once -- A is tested by (1,6), B by (6,7), C by (1,2), D by (2,6) (c) every decision in the program has taken all possible outcomes at least once -- false branch taken in (1), false branch in (6) But this is not sufficient for the fourth criterion of MC/DC (quoting from Wikipedia): (d) Each condition has been shown to affect that decision outcome independently. A condition is shown to affect a decision’s outcome independently by varying just that condition while holding fixed all other possible conditions. This criterion is satisfied for only for condition A (by 1;7), but not for B (6;7 would if the branches taken were different), C (1;2 would if the branches taken were different) or D. Note that the affects outcome requirement really forces us to consider the branch taken alongside with the conditions taken. Short-circuiting operators are really mostly an orthogonal concern to this. You simply cannot do MC/DC analysis without considering the branch taken in conjunction with the values taken by the conditions. Sami ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] MC/DC coverage explained wrong in the home page?
Richard Hipp wrote: Opinions vary on the exact meaning of MC/DC for a language (such as C) that has short-circuit boolean operators. snip There are problems with this view, though. In many instances, B is undefined when A is false. In other words, if A is false, any attempt to calculate B will give undefined results - possibly a segfault. SQLite really does use the fact that is a short-circuit operator in C and so when A is false, it is technically illegal to make any conjectures about the value of B. snip Your objections would be understandable if SQLite where written in Pascal or Ada where AND and OR operators are not short-circuit and where the compiler is free to reorder them if it sees fit. But in C/C++ where the and || operators are short-circuit, and where the tests must occur in a well-defined order, things are different. It is as if the and || operators really marked boundaries between decisions, not conditions. But the | and operators used inside a decision are *not* short-circuit, and in those cases, your objections are valid. snip This is why I think it is valuable for a programming language to provide multiple versions of some operations such as boolean and,or where one variant doesn't short-circuit and the other does. The primary purpose, then, of short-circuiting operators, is *not* about performance but rather about validity; they would be used in situations where the value of one operand determines whether it is even valid to calculate the other operand, such as if the first operand is a type check and the second is an operation that is only valid for some types. In fact, for a computationally complete language that is functional, I would argue that short-circuiting logic operators is essential. The non-short-circuiting boolean operators would be for all other uses, where the validity of one argument doesn't depend on the values of any of the other arguments, and so the compiler can be free to reorder it. -- Darren Duncan ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] NUL handling bugs (was Re: c-api document suggestion)
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 09/23/2011 08:47 AM, David Garfield wrote: But that is the point. Strings are generally defined in two ways. Either: 1) a pointer, and count every byte up to but not including a NUL. 2) a pointer and a length, and count every byte in the specified length. You insist on a dichotomy but SQLite supports both conventions simultaneously. You'll only have some difficulty if you use the second convention as some builtin functions operate on the first convention, but your own user defined functions/collations etc can do the right thing. Internally, SQLite uses counted strings. It should treat NUL as just one more character, since it is just one more character. It does with the exception of some user defined functions. ... user's database ... The same database can be populated by many different programs, and you should really distinguish between users and developers - sometimes they are the same person and sometimes not. Eg whose database is the one used by Chrome and Firefox? Actually, I guess I am more concerned with the shell. The shell is not a formal part of SQLite. It doesn't have the same level of testing or backwards/forwards compatibility requirements. It is still completely open code with no restrictions on how it operates so you can modify the code to do whatever you want. I use blob columns that are partly text and would like to have the text portions at least be visible. You cannot display bytes as text unless you know the encoding. There is nothing stopping you from altering the shell to do this, adding a hexdump output mode, adding a user defined function etc. The SQLite shell isn't particular well structured for easy developer extension. My Python SQLite wrapper includes a compatible shell that is easy to augment and extend: http://apidoc.apsw.googlecode.com/hg/shell.html Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) iEYEARECAAYFAk59DPMACgkQmOOfHg372QRh4gCdE+pNpGL7GG3FD9YVA161JdZj gYYAn2lrpRpKsNerKWOv6o47EH0AL/Jf =BPPC -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] MC/DC coverage explained wrong in the home page?
On 23 Sep 2011, at 11:30pm, Darren Duncan wrote: This is why I think it is valuable for a programming language to provide multiple versions of some operations such as boolean and,or where one variant doesn't short-circuit and the other does. The primary purpose, then, of short-circuiting operators, is *not* about performance but rather about validity; they would be used in situations where the value of one operand determines whether it is even valid to calculate the other operand, such as if the first operand is a type check and the second is an operation that is only valid for some types. In fact, for a computationally complete language that is functional, I would argue that short-circuiting logic operators is essential. Except that this is going to get rarer as languages arrive which encode multi-processing as part of the compilation process. One-by-one evaluation is going to vanish. If you have three processors free, why not evaluate three arguments at once, even if it turns out that the value of one of them means the other two don't matter ? We're going to find applications going down many 'false paths' in the process of arriving at a response. If you want your arguments evaluated, evaluated them yourself. If the order matters, evaluate them in that order. Don't rely on it as an undocumented side-effect of something else. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] NUL handling bugs (was Re: c-api document suggestion)
Roger Binns writes: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 09/23/2011 08:47 AM, David Garfield wrote: But that is the point. Strings are generally defined in two ways. Either: 1) a pointer, and count every byte up to but not including a NUL. 2) a pointer and a length, and count every byte in the specified length. You insist on a dichotomy but SQLite supports both conventions simultaneously. You'll only have some difficulty if you use the second convention as some builtin functions operate on the first convention, but your own user defined functions/collations etc can do the right thing. I insist on the dichotomy because the dichotomy is real. Think about the semantics of honoring NUL as end of string in the data stored by sqlite in the file. And there is a third hybrid model (NUL or count) that is so little used that most people don't recognize it when it is placed in front of them. SQLite's API supports both (mostly). Internally, you must use one or the other (or hideously duplicate code), and SQLite uses the second -- except for some functions (which use the hybrid model). That exception is the bug. Internally, SQLite uses counted strings. It should treat NUL as just one more character, since it is just one more character. It does with the exception of some user defined functions. Correction: with the exception of a number of BUILT IN functions. sqlite3_value_*() and sqlite3_result_*() are fully capable of using the counted model, and have the same shortcuts for the NUL-terminated model. ... user's database ... The same database can be populated by many different programs, and you should really distinguish between users and developers - sometimes they are the same person and sometimes not. Eg whose database is the one used by Chrome and Firefox? In this case, by user I meant the user of SQLite API, which is in fact a developer. Actually, I guess I am more concerned with the shell. The shell is not a formal part of SQLite. It doesn't have the same level of testing or backwards/forwards compatibility requirements. It is still completely open code with no restrictions on how it operates so you can modify the code to do whatever you want. I use blob columns that are partly text and would like to have the text portions at least be visible. You cannot display bytes as text unless you know the encoding. There is nothing stopping you from altering the shell to do this, adding a hexdump output mode, adding a user defined function etc. Of course, the SQLite shell does it anyway. So cannot is not really correct. The SQLite shell isn't particular well structured for easy developer extension. I've seen that... ouch. My Python SQLite wrapper includes a compatible shell that is easy to augment and extend: And your python wrapper is probably implemented using the counted string form exclusively. :-) http://apidoc.apsw.googlecode.com/hg/shell.html Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) iEYEARECAAYFAk59DPMACgkQmOOfHg372QRh4gCdE+pNpGL7GG3FD9YVA161JdZj gYYAn2lrpRpKsNerKWOv6o47EH0AL/Jf =BPPC -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] MC/DC coverage explained wrong in the home page?
On Fri, Sep 23, 2011 at 5:57 PM, Sami Liedes slie...@cc.hut.fi wrote: [Note: In case my explanations are not clear enough, there's a fairly formal position paper by Certification Authorities Software Team (CAST-10) clarifying MC/DC here: http://www.faa.gov/aircraft/air_cert/design_approvals/air_software/cast/cast_papers/media/cast-10.pdf ] Quoting from the above paper: -- 1. Structural coverage guidelines are: a) Every statement in the program has been invoked at least once; b) Every point of entry and exit in the program has been invoked at least once; c) Every control statement (i.e., branchpoint) in the program has taken all possible outcomes (i.e., branches) at least once; d) Every non-constant Boolean expression in the program has evaluated to both a True and a False result; e) Every non-constant condition in a Boolean expression in the program has evaluated to both a True and a False result; f) Every non-constant condition in a Boolean expression in the program has been shown to independently affect that expression's outcome. 2. Based upon these definitions: • Statement Coverage requires (a) only • DC requires (b, c, d) • MC/DC requires (b, c, d, e, f) -- The object-code branch coverage testing of SQLite clearly fulfills guidelines a, b, c, d, and e. The question at hand is does 100% branch coverage fulfill f in a short-circuit language such as C/C++. Note that the paper above completely ignores this issue. It is as if the authors had never heard of short-circuit evaluation. Or, perhaps they are familiar with the problem but could not reach agreement on its solution so simply didn't bring it up. (NB: The paper above uses branch in the context of source code. A source code branch might involve multiple conditions. When SQLite claims 100% branch test coverage, it is using branch in the context of object code, where there is exactly one branch per condition. It is important to distinguish between these two entirely different meanings of the word branch.) My claim is that in a short-circuit language, guideline e implies guideline f. In other words, if all boolean operators are short-circuited, then obtaining e automatically means that you also obtain f. But who knows what the FAA really wants. I've discussed this problem informally with DERs but I have never put SQLite through a formal review. If and when that happens, and if guideline f becomes an issue, I may have to add additional test cases. But for now, I think I will continue to claim MC/DC coverage. On Fri, Sep 23, 2011 at 02:05:40PM -0400, Richard Hipp wrote: For case (1), since B is uncomputable, we can deem it to be true. Then since case (3) has a different outcome from cases (1) and (2), we do show that each term is independent of the other. The fact that B is deemed true in case (1) might raise eyebrows, but in a shortcutting language, that's the best you can do, I think. Yes, that reasoning makes sense. But even allowing for that doesn't in all cases satisfy the fourth MC/DC criterion. It does for the simple (A B) case, but consider a more complex expression, ((A B) || (C D)): Now the truth table (with _ as possibly uncomputable) would be A B C D branch taken (1) 0 _ 0 _ F (2) 0 _ 1 0 F (3) 0 _ 1 1 T (4) 1 0 0 _ F (5) 1 0 1 0 F (6) 1 0 1 1 T (7) 1 1 _ _ T So using your approach, that is the plain Condition/Decision Coverage, I believe these test cases would suffice: A B C D branch taken (1) 0 _ 0 _ F (2) 0 _ 1 0 F (6) 1 0 1 1 T (7) 1 1 _ _ T This satisfies all three plain C/DC criteria: (a) Every point of entry and exit in the program has been invoked at least once -- does not apply to this if statement (b) Every condition in a decision in the program has taken all possible outcomes at least once -- A is tested by (1,6), B by (6,7), C by (1,2), D by (2,6) (c) every decision in the program has taken all possible outcomes at least once -- false branch taken in (1), false branch in (6) But this is not sufficient for the fourth criterion of MC/DC (quoting from Wikipedia): (d) Each condition has been shown to affect that decision outcome independently. A condition is shown to affect a decision’s outcome independently by varying just that condition while holding fixed all other possible conditions. This criterion is satisfied for only for condition A (by 1;7), but not for B (6;7 would if the branches taken were different), C (1;2 would if the branches taken were different) or D. Note that the affects outcome requirement really forces us to consider the branch taken alongside with the conditions taken. Short-circuiting operators are really mostly an orthogonal concern to this. You simply cannot do MC/DC analysis without considering the branch taken in conjunction with
Re: [sqlite] MC/DC coverage explained wrong in the home page?
On Fri, Sep 23, 2011 at 09:17:43PM -0400, Richard Hipp wrote: -- 1. Structural coverage guidelines are: a) Every statement in the program has been invoked at least once; b) Every point of entry and exit in the program has been invoked at least once; c) Every control statement (i.e., branchpoint) in the program has taken all possible outcomes (i.e., branches) at least once; d) Every non-constant Boolean expression in the program has evaluated to both a True and a False result; e) Every non-constant condition in a Boolean expression in the program has evaluated to both a True and a False result; f) Every non-constant condition in a Boolean expression in the program has been shown to independently affect that expression's outcome. 2. Based upon these definitions: • Statement Coverage requires (a) only • DC requires (b, c, d) • MC/DC requires (b, c, d, e, f) -- [...] My claim is that in a short-circuit language, guideline e implies guideline f. In other words, if all boolean operators are short-circuited, then obtaining e automatically means that you also obtain f. But certainly (e) alone (without (c)) cannot imply (f). A simple counterexample: if (A || 1) ... You can get (e) by giving test cases for A and !A, but most certainly flipping A does not independently affect the outcome as required by the plain reading of (f). Furthermore, I thought I just disproved the very claim that (b,c,d,e) implies (f), by giving a counterexample where (b,c,d,e) are satisfied but some of the conditions (namely B, C and D below) are *not* shown to independently affect the outcome even where they are evaluated. :-) The counterexample is quoted below. Sami Yes, that reasoning makes sense. But even allowing for that doesn't in all cases satisfy the fourth MC/DC criterion. It does for the simple (A B) case, but consider a more complex expression, ((A B) || (C D)): Now the truth table (with _ as possibly uncomputable) would be A B C D branch taken (1) 0 _ 0 _ F (2) 0 _ 1 0 F (3) 0 _ 1 1 T (4) 1 0 0 _ F (5) 1 0 1 0 F (6) 1 0 1 1 T (7) 1 1 _ _ T So using your approach, that is the plain Condition/Decision Coverage, I believe these test cases would suffice: A B C D branch taken (1) 0 _ 0 _ F (2) 0 _ 1 0 F (6) 1 0 1 1 T (7) 1 1 _ _ T This satisfies all three plain C/DC criteria: (a) Every point of entry and exit in the program has been invoked at least once -- does not apply to this if statement (b) Every condition in a decision in the program has taken all possible outcomes at least once -- A is tested by (1,6), B by (6,7), C by (1,2), D by (2,6) (c) every decision in the program has taken all possible outcomes at least once -- false branch taken in (1), false branch in (6) But this is not sufficient for the fourth criterion of MC/DC (quoting from Wikipedia): (d) Each condition has been shown to affect that decision outcome independently. A condition is shown to affect a decision’s outcome independently by varying just that condition while holding fixed all other possible conditions. This criterion is satisfied for only for condition A (by 1;7), but not for B (6;7 would if the branches taken were different), C (1;2 would if the branches taken were different) or D. Note that the affects outcome requirement really forces us to consider the branch taken alongside with the conditions taken. Short-circuiting operators are really mostly an orthogonal concern to this. You simply cannot do MC/DC analysis without considering the branch taken in conjunction with the values taken by the conditions. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] MC/DC coverage explained wrong in the home page?
if (A || 1) ... You can get (e) by giving test cases for A and !A, but most certainly flipping A does not independently affect the outcome as required by the plain reading of (f). I'm pretty sure that the latest versions of modern compilers will optimize the above if statement to the following: A; // everything inside if They won't even check the outcome of A and even won't calculate any part of A that has no side effects (and it's known at compile time). So in an object file A is not a condition and can hardly be called a boolean expression. Does (d), (e) and (f) even apply to it? Pavel On Fri, Sep 23, 2011 at 9:58 PM, Sami Liedes slie...@cc.hut.fi wrote: On Fri, Sep 23, 2011 at 09:17:43PM -0400, Richard Hipp wrote: -- 1. Structural coverage guidelines are: a) Every statement in the program has been invoked at least once; b) Every point of entry and exit in the program has been invoked at least once; c) Every control statement (i.e., branchpoint) in the program has taken all possible outcomes (i.e., branches) at least once; d) Every non-constant Boolean expression in the program has evaluated to both a True and a False result; e) Every non-constant condition in a Boolean expression in the program has evaluated to both a True and a False result; f) Every non-constant condition in a Boolean expression in the program has been shown to independently affect that expression's outcome. 2. Based upon these definitions: • Statement Coverage requires (a) only • DC requires (b, c, d) • MC/DC requires (b, c, d, e, f) -- [...] My claim is that in a short-circuit language, guideline e implies guideline f. In other words, if all boolean operators are short-circuited, then obtaining e automatically means that you also obtain f. But certainly (e) alone (without (c)) cannot imply (f). A simple counterexample: if (A || 1) ... You can get (e) by giving test cases for A and !A, but most certainly flipping A does not independently affect the outcome as required by the plain reading of (f). Furthermore, I thought I just disproved the very claim that (b,c,d,e) implies (f), by giving a counterexample where (b,c,d,e) are satisfied but some of the conditions (namely B, C and D below) are *not* shown to independently affect the outcome even where they are evaluated. :-) The counterexample is quoted below. Sami Yes, that reasoning makes sense. But even allowing for that doesn't in all cases satisfy the fourth MC/DC criterion. It does for the simple (A B) case, but consider a more complex expression, ((A B) || (C D)): Now the truth table (with _ as possibly uncomputable) would be A B C D branch taken (1) 0 _ 0 _ F (2) 0 _ 1 0 F (3) 0 _ 1 1 T (4) 1 0 0 _ F (5) 1 0 1 0 F (6) 1 0 1 1 T (7) 1 1 _ _ T So using your approach, that is the plain Condition/Decision Coverage, I believe these test cases would suffice: A B C D branch taken (1) 0 _ 0 _ F (2) 0 _ 1 0 F (6) 1 0 1 1 T (7) 1 1 _ _ T This satisfies all three plain C/DC criteria: (a) Every point of entry and exit in the program has been invoked at least once -- does not apply to this if statement (b) Every condition in a decision in the program has taken all possible outcomes at least once -- A is tested by (1,6), B by (6,7), C by (1,2), D by (2,6) (c) every decision in the program has taken all possible outcomes at least once -- false branch taken in (1), false branch in (6) But this is not sufficient for the fourth criterion of MC/DC (quoting from Wikipedia): (d) Each condition has been shown to affect that decision outcome independently. A condition is shown to affect a decision’s outcome independently by varying just that condition while holding fixed all other possible conditions. This criterion is satisfied for only for condition A (by 1;7), but not for B (6;7 would if the branches taken were different), C (1;2 would if the branches taken were different) or D. Note that the affects outcome requirement really forces us to consider the branch taken alongside with the conditions taken. Short-circuiting operators are really mostly an orthogonal concern to this. You simply cannot do MC/DC analysis without considering the branch taken in conjunction with the values taken by the conditions. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] MC/DC coverage explained wrong in the home page?
On Sep 23, 2011, at 9:17 PM, Richard Hipp wrote: paper above completely ignores this issue. It is as if the authors had never heard of short-circuit evaluation. Or, perhaps they are familiar with the problem but could not reach agreement on its solution so simply didn't bring it up. Another way to look at a short-circuit evaluation is that it does not represent a Boolean expression at all. It represents a control statement. A B = { if A then return B else return true } A || B = { if A then return true else return B } e ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users