I'm having an issue with seemingly slow and redundant queries during
runJobs. I'm using MW 1.15.5 and SMW 1.5.2, upgraded from 1.15.1 and
1.5.1 respectively. (The upgrade didn't noticably change the speed
or redundancy).
There are a few templates which are used by many pages, the worst
case so far being ~120k uses of a single template which turns the
~15 template parameters into ~10 properties, a category, and a page.
It uses #if and uses other templates, (wrapping the other template
calls in 'noinclude' doesn't appear to affect the runtime) and is
basically unremarkable.
So, running 'runJobs' gives output like the following:
With --procs=1
2010-09-21 16:14:17 refreshLinks2 Template:TX_Facility start=361172
end=362192 t=2407915 good
2010-09-21 16:53:19 refreshLinks2 Template:TX_Facility start=362193
end=363233 t=2341932 good
(E.g. ~40min per refreshLinks2 with start-end range ~1000)
With --procs=? (I don't remember, but with --procs=1, these were
t=~3500 and increase proportionally with procs, so I think the DB
is the bottleneck.)
2010-09-18 14:00:07 refreshLinks 42-009-34059_(TX_Facility) t=22856 good
2010-09-18 14:00:35 refreshLinks 42-001-31705_(TX_Facility) t=28058 good
2010-09-18 14:01:04 refreshLinks 42-003-01936_(TX_Facility) t=28900 good
Watching the SQL log, I see it pause at this query:
SELECT /* SMW::getInProperties 127.0.0.1 */ DISTINCT smw_title,smw_sortkey
FROM `smw_ids` INNER JOIN `smw_rels2` AS t1 ON t1.p_id=smw_id WHERE
t1.o_id='42720'
And in the MySQL shell I get:
mysql> SELECT DISTINCT smw_title,smw_sortkey FROM `smw_ids` INNER JOIN
`smw_rels2` AS t1 ON t1.p_id=smw_id WHERE t1.o_id='42720';
+--------------+--------------+
| smw_title | smw_sortkey |
+--------------+--------------+
| Well_number | Well number |
| State_number | State number |
+--------------+--------------+
2 rows in set (5.98 sec)
mysql> SELECT COUNT(*) FROM `smw_ids` INNER JOIN `smw_rels2` AS t1 ON
t1.p_id=smw_id WHERE t1.o_id='42720';
+----------+
| COUNT(*) |
+----------+
| 123037 |
+----------+
1 row in set (5.56 sec)
The redundancy is that the SELECT DISTINCT output is the same
every time. MySQL query cache is on, and Memcached is used, but
neither made a noticable difference. (I assume some other query
during runJobs invalidates it...)
So, is this a bug or am I doing something totally wrong?
Thanks,
Dan
------------------------------------------------------------------------------
Start uncovering the many advantages of virtual appliances
and start using them to simplify application deployment and
accelerate your shift to cloud computing.
http://p.sf.net/sfu/novell-sfdev2dev
_______________________________________________
Semediawiki-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel