[Wikidata-bugs] [Maniphest] [Commented On] T144592: Search index a limited number of article placeholders on cywiki for testing and evaluation purposes

2017-06-14 Thread Nemo_bis
Nemo_bis added a comment.

In T144592#3038988, @Nemo_bis wrote:
Sorry for triple message... do I see correctly (https://archive.fo/MakpZ ) that currently https://cy.wikipedia.org/wiki/Arbennig:AboutTopic/Q272 is the only URL actually indexed by Google?


Now searching the localised special page name:
https://duckduckgo.com/?q="Arbennig%3AAm_y_Pwnc"+site%3Acy.wikipedia.org
DuckDuckGo shows a few results, mostly in Welsh: https://archive.fo/iRyCb

Google picks up results that are mostly in English, such as this one (second for me):

Wanfried agreement - Wicipedia
 https://cy.wikipedia.org/wiki/Arbennig:Am_y_Pwnc/Q1441
 treaty transferring territory between the United States and Soviet occupation zones of Germany after World War II. Karte Wanfrieder Abkommen.png

https://archive.fo/SVu8t

TASK DETAIL
https://phabricator.wikimedia.org/T144592

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: hoo, Nemo_bis
Cc: thiemowmde, Stashbot, gerritbot, Deskana, Izno, Lydia_Pintscher, Aklapper, Lucie, Ricordisamoa, Nemo_bis, DarTar, MZMcBride, hoo, GoranSMilovanovic, QZanden, cmadeo, Wikidata-bugs, aude, jayvdb, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T144592: Search index a limited number of article placeholders on cywiki for testing and evaluation purposes

2017-02-22 Thread hoo
hoo added a comment.

In T144592#3039068, @thiemowmde wrote:
If you click to show "duplicate" search results, you find that Google tries to index URLs like https://cy.wikipedia.org/wiki.phtml?title=Special:AboutTopic/Q2050 but cannot; it says this is because of https://cy.wikipedia.org/robots.txt, though I cannot track down the rule. The problem here is not that it can't index these URLs; that is fine. The problem is: how does it even find these weird URLs?


I gave some of these to Google in order to experiment a bit. These should not be ranked highly and won't appear in any real-world searches.

I guess it will take a few more weeks until Google and other search engines start picking up the other placeholders :/


[Wikidata-bugs] [Maniphest] [Commented On] T144592: Search index a limited number of article placeholders on cywiki for testing and evaluation purposes

2017-02-19 Thread thiemowmde
thiemowmde added a comment.
Yes, this is still the only article for now: https://www.google.com/search?q=site:cy.wikipedia.org+inurl:AboutTopic

If you click to show "duplicate" search results, you find that Google tries to index URLs like https://cy.wikipedia.org/wiki.phtml?title=Special:AboutTopic/Q2050 but cannot; it says this is because of https://cy.wikipedia.org/robots.txt, though I cannot track down the rule. The problem here is not that it can't index these URLs; that is fine. The problem is: how does it even find these weird URLs?
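
One way to track down the responsible rule is to list every Disallow prefix that matches the URL in question. The following is a minimal Python sketch, not a full robots.txt implementation: it ignores user-agent groups, Allow lines and wildcards, and the sample rules are hypothetical, not the actual content of https://cy.wikipedia.org/robots.txt.

```python
from urllib.parse import urlsplit


def matching_disallow_rules(robots_txt, url):
    """Return every Disallow value that is a prefix of the URL's path and query."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    matches = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        key, _, value = line.partition(":")
        value = value.strip()
        if key.strip().lower() == "disallow" and value and target.startswith(value):
            matches.append(value)
    return matches


# Hypothetical rules, for illustration only.
SAMPLE_RULES = """\
User-agent: *
Disallow: /w/
Disallow: /wiki.phtml
"""

blocked = matching_disallow_rules(
    SAMPLE_RULES,
    "https://cy.wikipedia.org/wiki.phtml?title=Special:AboutTopic/Q2050",
)
```

Running the same function against the downloaded robots.txt, with the URLs Google reports as blocked, would show which prefix (if any) applies under these simplified matching rules.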


[Wikidata-bugs] [Maniphest] [Commented On] T144592: Search index a limited number of article placeholders on cywiki for testing and evaluation purposes

2017-02-19 Thread Nemo_bis
Nemo_bis added a comment.
Sorry for triple message... do I see correctly (https://archive.fo/MakpZ ) that currently https://cy.wikipedia.org/wiki/Arbennig:AboutTopic/Q272 is the only URL actually indexed by Google?


[Wikidata-bugs] [Maniphest] [Commented On] T144592: Search index a limited number of article placeholders on cywiki for testing and evaluation purposes

2017-02-19 Thread Nemo_bis
Nemo_bis added a comment.
I note however that https://www.mediawiki.org/w/index.php?diff=2373589 contradicts the task description, since it says «all placeholders for Items that have an id up Q3000» (bold added).


[Wikidata-bugs] [Maniphest] [Commented On] T144592: Search index a limited number of article placeholders on cywiki for testing and evaluation purposes

2017-02-19 Thread Nemo_bis
Nemo_bis added a comment.
Thanks for updating the task description.


[Wikidata-bugs] [Maniphest] [Commented On] T144592: Search index a limited number of article placeholders on cywiki for testing and evaluation purposes

2017-02-07 Thread Stashbot
Stashbot added a comment.
Mentioned in SAL (#wikimedia-operations) [2017-02-07T14:12:14Z] Synchronized wmf-config/: Search index article placeholders on cywiki up to Q2794 (T144592) (duration: 00m 42s)


[Wikidata-bugs] [Maniphest] [Commented On] T144592: Search index a limited number of article placeholders on cywiki for testing and evaluation purposes

2017-02-07 Thread gerritbot
gerritbot added a comment.
Change 336225 merged by jenkins-bot:
Search index article placeholders on cywiki up to Q2794

https://gerrit.wikimedia.org/r/336225
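
According to its summary, the change makes placeholders for items with ids up to Q2794 indexable. A minimal Python sketch of such a cutoff check follows. The function name and its default are illustrative assumptions, the cutoff is assumed to be inclusive, and the actual configuration is in the Gerrit change linked above, which is not reproduced here.

```python
def is_search_indexable(item_id, max_indexed="Q2794"):
    """Return True if item_id (e.g. "Q1441") is at or below the cutoff id.

    Assumption: "up to Q2794" is inclusive.
    """
    return int(item_id.lstrip("Q")) <= int(max_indexed.lstrip("Q"))
```

Under this assumption, the ids that come up elsewhere in the thread (Q272, Q1441, Q2050) all fall below the cutoff.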


[Wikidata-bugs] [Maniphest] [Commented On] T144592: Search index a limited number of article placeholders on cywiki for testing and evaluation purposes

2017-02-06 Thread gerritbot
gerritbot added a comment.
Change 336225 had a related patch set uploaded (by Hoo man):
Search index article placeholders up to Q2794

https://gerrit.wikimedia.org/r/336225


[Wikidata-bugs] [Maniphest] [Commented On] T144592: Search index a limited number of article placeholders on cywiki for testing and evaluation purposes

2017-01-31 Thread hoo
hoo added a comment.
Concrete query used:

-- Items with more than 2 claims and more than 3 sitelinks that have no cywiki sitelink
SELECT page_title
FROM page
INNER JOIN wb_entity_per_page ON epp_page_id = page_id
INNER JOIN page_props AS pp_sl ON pp_sl.pp_page = page_id AND pp_sl.pp_propname = 'wb-sitelinks'
INNER JOIN page_props AS pp_st ON pp_st.pp_page = page_id AND pp_st.pp_propname = 'wb-claims'
WHERE pp_st.pp_value > 2
  AND pp_sl.pp_value > 3
  AND NOT EXISTS (
    SELECT 1 FROM wb_items_per_site
    WHERE ips_site_id = 'cywiki' AND ips_item_id = epp_entity_id
  )
ORDER BY epp_entity_id ASC
LIMIT 1000;
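
The selection criteria of the query above can be restated as a small, self-contained Python sketch. It does not touch the real database: the Item record and its field names are hypothetical stand-ins for the page_props and wb_items_per_site rows, and only the thresholds (more than 2 claims, more than 3 sitelinks, no cywiki sitelink, lowest ids first, at most 1000 rows) are taken from the query.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical in-memory stand-in for one Wikidata item."""
    numeric_id: int                               # e.g. 1441 for Q1441
    claims: int                                   # value of the 'wb-claims' page prop
    sitelinks: set = field(default_factory=set)   # site ids the item links to


def placeholder_candidates(items, site="cywiki", limit=1000):
    """Return the ids of items that pass the same filters as the SQL query."""
    selected = [
        item for item in items
        if item.claims > 2                  # pp_st.pp_value > 2
        and len(item.sitelinks) > 3         # pp_sl.pp_value > 3
        and site not in item.sitelinks      # NOT EXISTS ... ips_site_id = 'cywiki'
    ]
    selected.sort(key=lambda item: item.numeric_id)  # ORDER BY epp_entity_id ASC
    return [f"Q{item.numeric_id}" for item in selected[:limit]]
```

The two thresholds keep out sparse items, and the cywiki check keeps out items that already have a Welsh article.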

Results (indexable user page): https://cy.wikipedia.org/wiki/Defnyddiwr:Hoo_man/T144592-placeholders

Note: The placeholders themselves are not indexable yet.