Re: [SMW-devel] SMW Performance
On Sunday, 30 December 2007, Sergey Chernyshev wrote:
> Hmm. I didn't realize there is a way to remove the $smwgQDefaultNamespaces restriction, and that this enables all namespaces instead of disabling them. Why is this setting not set to NULL by default, then? I don't see any point in restricting namespaces unless it's absolutely necessary for security reasons or something.

I agree. Done.

Markus

-- 
Markus Krötzsch
Institut AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe
phone +49 (0)721 608 7362    fax +49 (0)721 608 5998
[EMAIL PROTECTED]    www http://korrekt.org

Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
Re: [SMW-devel] SMW Performance
On Friday, 28 December 2007, Lau, William (NIH/CIT) [E] wrote:
> We have a set of semantic queries in a template. That template is used in some pages. However, by looking at the database process list, it seems that this set of queries is processed whenever a page is requested, even when the template is not used by the requested page (e.g. special pages).

Do I understand this correctly? The delivery of special pages that are completely unrelated to said template triggers the ask-queries contained therein? This would be very strange behaviour indeed (I cannot currently imagine how or why this should happen in MediaWiki)!

> All the SQL queries are generated by the getQueryResult function. Since those queries are very computationally intensive, this bug slows down the entire site. If we take the inline queries out of the template or change $smwgQEnabled to false, the site becomes fast again. Has anyone experienced the same issue?

In general, if queries on some site are too slow, it is useful to configure SMW to support faster querying (with fewer features, of course). Basic settings one can try to speed up querying are:

    include_once('extensions/SemanticMediaWiki/includes/SMW_Settings.php');
    $smwgQSubcategoryDepth = 0;
    $smwgQSubpropertyDepth = 0;
    $smwgQEqualitySupport = SMW_EQ_NONE;
    $smwgQDefaultNamespaces = NULL;
    enableSemantics('semedia-wiki.localhost');

Those settings will speed up basically all queries, disabling all support for property and category hierarchies, equality (redirects), and namespace restrictions (i.e. queries consider pages in all namespaces, including, e.g., User:). You can experiment to see which of those, if any, improves your query performance.

If you have problems with overly complex user-generated queries, then the parameters $smwgQMaxSize and $smwgQMaxDepth are an option to restrict them.

In general, it should be emphasised that queries should be used in a targeted way. Ontoworld.org had the infamous template {{ask}} for some time, which included queries for almost anything, and which simply did not appear if no results were obtained. Most wikis should rather have single query templates for special purposes instead of trying to have one for all.

Anyway, for further optimisation, we need some pointer to your site, or at least some statistical information concerning its size (Special:SemanticStatistics) and the query structure. Did you mention the SMW version you use? Some of the above assume SMW 1.0-RC3, and none will work prior to SMW 1.0-RC1.

Markus
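As a rough illustration of the advice above, the settings could be relaxed one at a time to find out which one matters, with limits on user-written queries added on top. This is only a sketch for LocalSettings.php: the numeric values are placeholders chosen for illustration (not recommended or default values), and the host name is the one from the message.

```php
<?php
// Sketch only: assumes the SMW 1.0-RC3 era settings named in the message.
include_once('extensions/SemanticMediaWiki/includes/SMW_Settings.php');

// Start from the fastest configuration: every optional feature switched off.
$smwgQSubcategoryDepth  = 0;            // no category hierarchy
$smwgQSubpropertyDepth  = 0;            // no property hierarchy
$smwgQEqualitySupport   = SMW_EQ_NONE;  // no equality (redirect) handling
$smwgQDefaultNamespaces = NULL;         // no namespace restriction

// Then re-enable ONE feature at a time and measure page load again, e.g.:
// $smwgQSubcategoryDepth = 2;          // placeholder depth, an assumption

// Caps on the size and nesting depth of user-written queries.
// The numbers are illustrative placeholders, not known defaults.
$smwgQMaxSize  = 8;
$smwgQMaxDepth = 3;

enableSemantics('semedia-wiki.localhost');
```

Changing a single setting between measurements makes it easier to attribute any slowdown to one specific feature.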
Re: [SMW-devel] SMW Performance
Hmm. I didn't realize there is a way to remove the $smwgQDefaultNamespaces restriction, and that this enables all namespaces instead of disabling them. Why is this setting not set to NULL by default, then? I don't see any point in restricting namespaces unless it's absolutely necessary for security reasons or something.

Sergey

On Dec 29, 2007 10:29 AM, Markus Krötzsch [EMAIL PROTECTED] wrote:
> [...]
> Basic settings one can try to speed up querying are:
>
>     include_once('extensions/SemanticMediaWiki/includes/SMW_Settings.php');
>     $smwgQSubcategoryDepth = 0;
>     $smwgQSubpropertyDepth = 0;
>     $smwgQEqualitySupport = SMW_EQ_NONE;
>     $smwgQDefaultNamespaces = NULL;
>     enableSemantics('semedia-wiki.localhost');
>
> Those settings will speed up basically all queries, disabling all support for property and category hierarchies, equality (redirects), and namespace restrictions (i.e. queries consider pages in all namespaces, including, e.g., User:).
> [...]

-- 
Sergey Chernyshev
http://www.sergeychernyshev.com/
Re: [SMW-devel] SMW performance
> Forget my previous post. The problem goes away when I removed one template. It seems the performance issue is related to the application instead of the database.

Try setting up eAccelerator for PHP; maybe it would help a bit.

Also, I believe that MW/SMW requires a dedicated server (co-location). We tried the usual low-cost hosting (shared with hundreds of other virtual hosts), and even MW alone was crawling. It was also slow under Windows; it's OK on a Linux server.

Of course, you could also try a dedicated MySQL server. MW even supports MySQL clustering and web load-balancing, because it is used by Wikipedia, one of the busiest and largest sites in the world.

Dmitriy
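The database load-balancing mentioned above is configured in MediaWiki through the $wgDBservers array in LocalSettings.php. The following is a minimal sketch, assuming one master and one read replica; all host names, credentials, and load weights are hypothetical placeholders, and replication between the two MySQL servers has to be set up separately.

```php
<?php
// Hypothetical two-server setup; the first entry is treated as the master.
$wgDBservers = array(
    array(
        'host'     => 'db-master.example.org',  // placeholder host
        'dbname'   => 'wikidb',                 // placeholder database name
        'user'     => 'wikiuser',               // placeholder credentials
        'password' => 'secret',
        'type'     => 'mysql',
        'flags'    => DBO_DEFAULT,
        'load'     => 0,  // weight 0: keep ordinary reads off the master
    ),
    array(
        'host'     => 'db-replica.example.org', // placeholder host
        'dbname'   => 'wikidb',
        'user'     => 'wikiuser',
        'password' => 'secret',
        'type'     => 'mysql',
        'flags'    => DBO_DEFAULT,
        'load'     => 1,  // reads are distributed according to this weight
    ),
);
```

Whether this helps with slow SMW queries depends on where the time is actually spent, so it is worth measuring before and after.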
Re: [SMW-devel] SMW performance
Forget my previous post. The problem goes away when I removed one template. It seems the performance issue is related to the application instead of the database.

From: Wang, Alex (NIH/CIT) [E]
Sent: Thursday, December 27, 2007 4:46 PM
To: semediawiki-devel@lists.sourceforge.net
Subject: [SMW-devel] SMW performance

> It is very slow (30 seconds) to open any page, including the Main Page. Checking the processes in mysqld, I found the following SQL statements, one for each page respectively. I guess `prop11`, `prop8`, and `cats6` are temporary tables, which makes it hard to tune these statements. Does anyone know if these statements look all right? Your help is greatly appreciated.
>
>     SELECT DISTINCT `page`.page_title as title, `page`.page_namespace as namespace, `page`.page_id as id
>     FROM `prop11`, `prop8`, `cats6`, `page`
>       INNER JOIN `categorylinks` AS cl5 ON cl5.cl_from=`page`.page_id
>       INNER JOIN `smw_relations` AS rel7 ON rel7.subject_id=`page`.page_id
>       LEFT JOIN `redirect` AS rd9 ON rd9.rd_from=rel7.object_id
>       INNER JOIN `smw_relations` AS rel10 ON rel10.subject_id=`page`.page_id
>       LEFT JOIN `redirect` AS rd12 ON rd12.rd_from=rel10.object_id
>     WHERE ((`page`.page_namespace='0') OR (`page`.page_namespace='6'))
>       AND ((cats6.title=cl5.cl_to)
>       AND (prop8.title=rel7.relation_title
>         AND ((rel7.object_title='Main_Page' AND rel7.object_namespace=0)
>           OR (rd9.rd_title='Main_Page' AND rd9.rd_namespace=0)))
>       AND (prop11.title=rel10.relation_title
>         AND ((rel10.object_title='Parotid' AND rel10.object_namespace=0)
>           OR (rd12.rd_title='Parotid' AND rd12.rd_namespace=0))))
>     LIMIT 51
>
>     SELECT DISTINCT `page`.page_title as title, `page`.page_namespace as namespace, `page`.page_id as id
>     FROM `prop11`, `prop8`, `cats6`, `page`
>       INNER JOIN `categorylinks` AS cl5 ON cl5.cl_from=`page`.page_id
>       INNER JOIN `smw_relations` AS rel7 ON rel7.subject_id=`page`.page_id
>       LEFT JOIN `redirect` AS rd9 ON rd9.rd_from=rel7.object_id
>       INNER JOIN `smw_relations` AS rel10 ON rel10.subject_id=`page`.page_id
>       LEFT JOIN `redirect` AS rd12 ON rd12.rd_from=rel10.object_id
>     WHERE ((`page`.page_namespace='0') OR (`page`.page_namespace='6'))
>       AND ((cats6.title=cl5.cl_to)
>       AND (prop8.title=rel7.relation_title
>         AND ((rel7.object_title='Monobook.css' AND rel7.object_namespace=0)
>           OR (rd9.rd_title='Monobook.css' AND rd9.rd_namespace=0)))
>       AND (prop11.title=rel10.relation_title
>         AND ((rel10.object_title='Parotid' AND rel10.object_namespace=0)
>           OR (rd12.rd_title='Parotid' AND rd12.rd_namespace=0))))
>     LIMIT 51
>
>     SELECT DISTINCT `page`.page_title as title, `page`.page_namespace as namespace, `page`.page_id as id
>     FROM `prop11`, `prop8`, `cats6`, `page`
>       INNER JOIN `categorylinks` AS cl5 ON cl5.cl_from=`page`.page_id
>       INNER JOIN `smw_relations` AS rel7 ON rel7.subject_id=`page`.page_id
>       LEFT JOIN `redirect` AS rd9 ON rd9.rd_from=rel7.object_id
>       INNER JOIN `smw_relations` AS rel10 ON rel10.subject_id=`page`.page_id
>       LEFT JOIN `redirect` AS rd12 ON rd12.rd_from=rel10.object_id
>     WHERE ((`page`.page_namespace='0') OR (`page`.page_namespace='6'))
>       AND ((cats6.title=cl5.cl_to)
>       AND (prop8.title=rel7.relation_title
>         AND ((rel7.object_title='Protein_Clusters' AND rel7.object_namespace=0)
>           OR (rd9.rd_title='Protein_Clusters' AND rd9.rd_namespace=0)))
>       AND (prop11.title=rel10.relation_title
>         AND ((rel10.object_title='Parotid' AND rel10.object_namespace=0)
>           OR (rd12.rd_title='Parotid' AND rd12.rd_namespace=0))))
>     LIMIT 51
>
>     SELECT DISTINCT `page`.page_title as title, `page`.page_namespace as namespace, `page`.page_id as id
>     FROM `prop11`, `prop8`, `cats6`, `page`
>       INNER JOIN `categorylinks` AS cl5 ON cl5.cl_from=`page`.page_id
>       INNER JOIN `smw_relations` AS rel7 ON rel7.subject_id=`page`.page_id
>       LEFT JOIN `redirect` AS rd9 ON rd9.rd_from=rel7.object_id
>       INNER JOIN `smw_relations` AS rel10 ON rel10.subject_id=`page`.page_id
>       LEFT JOIN `redirect` AS rd12 ON rd12.rd_from=rel10.object_id
>     WHERE ((`page`.page_namespace='0') OR (`page`.page_namespace='6'))
>       AND ((cats6.title=cl5.cl_to)
>       AND (prop8.title=rel7.relation_title
>         AND ((rel7.object_title='HSPP_Cluster:0067' AND rel7.object_namespace=0)
>           OR (rd9.rd_title='HSPP_Cluster:0067' AND rd9.rd_namespace=0)))
>       AND (prop11.title=rel10.relation_title
>         AND ((rel10.object_title='Parotid' AND rel10.object_namespace=0)
>           OR (rd12.rd_title='Parotid' AND rd12.rd_namespace=0))))
>     LIMIT 51