Hi, I am using MySQL 5.0.51a. I've got a problem with a stored function. It reads as follows:
CREATE FUNCTION `_contractRoot`(temp INT) RETURNS int(11) READS SQL DATA BEGIN DECLARE _parent_id INT; REPEAT SET _parent_id = temp; SELECT parent_id INTO temp FROM contract WHERE id = _parent_id; UNTIL temp IS NULL END REPEAT; RETURN _parent_id; END
There is a table "contract" containing row groups that form a tree. In the table, the columns "id", "parent_id" and "number" are defined. Each tree root has set the NULL value for the "parent_id" column, while the child rows have references to the ID of another row, and by following these references from any child, the root ID will be calculated by this function. Additionaly, there is a value set in the "number" column if parent_id IS NOT NULL, i.e. the root row has a value set, while the other rows have number = NULL. This maps each tree to a number. (There are 1.500 rows in the "contract" table.) Now, let's look at these queries: mysql> SELECT _contractRoot(320); +--------------------+ | _contractRoot(320) | +--------------------+ | 317 | +--------------------+ 1 row in set (0.00 sec) mysql> SELECT number FROM contract WHERE id = _contractRoot(320); +--------+ | number | +--------+ | 93 | +--------+ 1 row in set (0.06 sec) As you see, _contractRoot(320) is run in a very short period of time, while the second statement seems to run the _contractRoot function for each line that is processed by the WHERE clause, and therefore takes longer. This was not the case before the upgrade to Debian Lenny. Before (MySQL 5.0.32), the second statement also took nearly 0.00 sec. So, is this a regression? I noticed that the problem can be fixed by specifying DETERMINISTIC as an additional keyword in the CREATE FUNCTION statement. In this case, the second statement also runs in 0.00 sec. But I'm not sure if DETERMINISTIC is legal here. As you see, the function reads data from the table, and it may be that these data changes. So the question is what DETERMINISTIC means! Two options: 1.) DETERMINISTIC means that the function does not depend on variable data and will _always_ return the same value. This corresponds to what the manual says: "A procedure or function is considered “deterministic” if it always produces the same result for the same input parameters, and “not deterministic” otherwise." But I'm not sure if "always" can be construed that strictly here. It would be inconsistent with the "READS SQL DATA" characteristic. 2.) DETERMINISTIC means that the function does not use any non-constant input except data from the database tables (i.e. no CURRENT_DATE(), random numbers etc.). How would the caching mechanism work in this case? In the second case, I may declare my function DETERMINISTIC, in the first case I may not. Does anybody know what is right here? I did another observations that is closely related to this: Even with the DETERMINISTIC keyword, the following takes long: mysql> SELECT id, _contractRoot(320) FROM contract; +------+--------------------+ | id | _contractRoot(320) | +------+--------------------+ ... | 1560 | 317 | | 1561 | 317 | +------+--------------------+ 1477 rows in set (0.06 sec) The run time seems to be independent of the use of DETERMINISTIC, but to my understanding, there is no need to execute the function more often than when doing mysql> SELECT number FROM contract WHERE id = _contractRoot(320); which returns after 0.00 sec. As I said, this behaviour happens despite of DETERMINISTIC. I'm not sure if this is a bug. Thanks, Peter -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org