[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2014-02-25 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

Helder mybugs.m...@gmail.com changed:

   What     |Removed |Added
   See Also |        |https://bugzilla.wikimedia.org/show_bug.cgi?id=5589

-- 
You are receiving this mail because:
You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2014-02-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

PiRSquared17 pirsquare...@gmail.com changed:

   What     |Removed |Added
   See Also |        |https://bugzilla.wikimedia.org/show_bug.cgi?id=61840



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-08-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

Pyb pierre.beaudo...@gmail.com changed:

   What |Removed |Added
   CC   |        |pierre.beaudo...@gmail.com

--- Comment #44 from Pyb pierre.beaudo...@gmail.com ---
Thx for this new special page.

It doesn't work on [[Catégorie:Portail:Hélicoptères/Articles liés]]

https://fr.wikipedia.org/wiki/Sp%C3%A9cial:RandomInCategory/Portail:H%C3%A9licopt%C3%A8res/Articles_li%C3%A9s



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-08-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #45 from Nemo federicol...@tiscali.it ---
(In reply to comment #44)
> Thx for this new special page.
>
> It doesn't work on [[Catégorie:Portail:Hélicoptères/Articles liés]]
>
> https://fr.wikipedia.org/wiki/Sp%C3%A9cial:RandomInCategory/Portail:H%C3%A9licopt%C3%A8res/Articles_li%C3%A9s

The Portail: prefix is being eaten. Can you check if it happens with any prefix
matching the name of a namespace, or just with any : prefix, and file a bug?



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-08-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

Nemo federicol...@tiscali.it changed:

   What     |Removed           |Added
   Assignee |vasi...@gmail.com |bawolff...@gmail.com



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-08-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #46 from Bawolff (Brian Wolff) bawolff...@gmail.com ---
(In reply to comment #44)
> Thx for this new special page.
>
> It doesn't work on [[Catégorie:Portail:Hélicoptères/Articles liés]]
>
> https://fr.wikipedia.org/wiki/Sp%C3%A9cial:RandomInCategory/Portail:H%C3%A9licopt%C3%A8res/Articles_li%C3%A9s

This is fixed on master. It should be fixed on Wikimedia sites the next time
they are updated (Thursday).

Until then, include the Category: prefix with the page name and it should work.



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-08-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #42 from Gerrit Notification Bot gerritad...@wikimedia.org ---
Change 71997 merged by Brion VIBBER:
Add Special:RandomInCategory.

https://gerrit.wikimedia.org/r/71997



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-08-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

Bawolff (Brian Wolff) bawolff...@gmail.com changed:

   What       |Removed |Added
   Status     |NEW     |RESOLVED
   Resolution |---     |FIXED

--- Comment #43 from Bawolff (Brian Wolff) bawolff...@gmail.com ---
(In reply to comment #41)
> Personally, I'd rather see a schema change or Lucene/Solr improvements cover
> this.

Perhaps open a separate bug for that, since this patch has been merged.



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-07-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #41 from MZMcBride b...@mzmcbride.com ---
Personally, I'd rather see a schema change or Lucene/Solr improvements cover
this.



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-07-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

Nemo federicol...@tiscali.it changed:

   What     |Removed |Added
   See Also |        |https://bugzilla.wikimedia.org/show_bug.cgi?id=46918



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-07-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

Nemo federicol...@tiscali.it changed:

   What |Removed |Added
   CC   |        |federicol...@tiscali.it

--- Comment #40 from Nemo federicol...@tiscali.it ---
(In reply to comment #38)
> Algorithm is:
> *Get earliest and newest cl_timestamp in a category
> *Pick a date in between
> *Pick an offset between 0 and 30
> *Get the page that is offset number of pages after the date picked.
>
> Thoughts?

So, the downside of this is that pages added to the category in bulk at around
the same time would be consistently underrepresented, if I understand correctly.
Those might be category renames, bot additions, new templates including the
category... It may make it very hard to clear such big backlogs, but it would
give a better representation of slower, human additions to the category.
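
To make the concern concrete, here is a small simulation of the date-plus-offset
method on a made-up category. It is only a sketch: the category contents (100
pages added slowly, 1000 pages added in one burst), the wrap-around at the end
of the category, and the names pick and burst_share are assumptions for
illustration, not anything taken from the patch.

```python
import bisect
import random


def pick(timestamps, rng, max_offset=30):
    """Return the index chosen by the date-plus-offset method."""
    pivot = rng.uniform(timestamps[0], timestamps[-1])
    start = bisect.bisect_left(timestamps, pivot)
    # Assumption: wrap around when the offset runs past the newest entry.
    return (start + rng.randint(0, max_offset)) % len(timestamps)


def burst_share(trials=10000, seed=0):
    """Return (share of burst pages in the category, share of picks that hit them)."""
    rng = random.Random(seed)
    # 100 pages added by hand, one every 10 time units.
    entries = [(10.0 * i, "slow") for i in range(100)]
    # 1000 pages added by a bot within a single time unit.
    entries += [(500.0 + i / 1000.0, "burst") for i in range(1, 1001)]
    entries.sort()
    timestamps = [t for t, _ in entries]
    kinds = [k for _, k in entries]
    hits = sum(kinds[pick(timestamps, rng)] == "burst" for _ in range(trials))
    return kinds.count("burst") / len(kinds), hits / trials


membership, selected = burst_share()
print(f"burst pages: {membership:.0%} of the category, {selected:.0%} of the picks")
```

In this toy setup the burst pages make up most of the category but are picked
far less often than their share, because the random date rarely falls inside
the short window in which they were added.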



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-07-04 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #37 from Gerrit Notification Bot gerritad...@wikimedia.org ---
Change 71997 had a related patch set uploaded by Brian Wolff:
Add Special:RandomInCategory.

https://gerrit.wikimedia.org/r/71997



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-07-04 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #38 from Bawolff (Brian Wolff) bawolff...@gmail.com ---
(In reply to comment #37)
> Change 71997 had a related patch set uploaded by Brian Wolff:
> Add Special:RandomInCategory.
>
> https://gerrit.wikimedia.org/r/71997

I had an idea for an efficient method that doesn't need a schema change.
However, it gives quite biased results in some cases. (You can have two of
cheap [in terms of the ops work needed for a schema change], fast, and good.
This one is cheap and fast.)

I think this is good enough for the common use case of people just wanting an
entry from a category that is different from the one they got the last time
they hit the random button (for example, to get a random page out of articles
needing cleanup). Doing something better would need a schema change, or some
other more exotic solution. I think this method could be good enough for now.


Algorithm is:
*Get earliest and newest cl_timestamp in a category
*Pick a date in between
*Pick an offset between 0 and 30
*Get the page that is offset number of pages after the date picked.

Thoughts?
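
For readers who want to see the four steps spelled out, here is a minimal
Python sketch. It works on an in-memory list instead of the categorylinks
table, and it is not the code from change 71997. The function name, the
numeric timestamps, and the wrap-around when the offset runs past the newest
entry are assumptions made for the example.

```python
import bisect
import random


def random_in_category(timestamps, pages, max_offset=30, rng=random):
    """Pick a page using the date-plus-offset method.

    timestamps: cl_timestamp values as numbers, sorted ascending.
    pages: page titles, in the same order as timestamps.
    """
    if not pages:
        return None
    # Step 1: earliest and newest cl_timestamp in the category.
    earliest, newest = timestamps[0], timestamps[-1]
    # Step 2: pick a date in between.
    pivot = rng.uniform(earliest, newest)
    # Step 3: pick an offset between 0 and max_offset.
    offset = rng.randint(0, max_offset)
    # Step 4: take the page `offset` entries after the picked date.
    start = bisect.bisect_left(timestamps, pivot)
    # Assumption: wrap around if the offset runs past the newest entry.
    return pages[(start + offset) % len(pages)]
```

With a database, steps 1 and 4 would each be an indexed lookup on
cl_timestamp, which is why no full scan of the category is needed.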



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-07-04 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #39 from Bawolff (Brian Wolff) bawolff...@gmail.com ---
A possible tweak would be to randomly choose whether we use cl_timestamp >
random_timestamp or cl_timestamp < random_timestamp (along with ASC vs. DESC),
which might even things out if a category had mostly old entries from very
long ago and a few very recent outliers.
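
A rough sketch of this tweak, again on an in-memory list rather than the real
table. The function name, the 50/50 coin flip, and the wrap-around at either
end of the category are assumptions for illustration only.

```python
import bisect
import random


def pick_with_random_direction(timestamps, pages, max_offset=30, rng=random):
    """Date-plus-offset pick that scans forward or backward at random.

    timestamps: cl_timestamp values as numbers, sorted ascending.
    pages: page titles, in the same order as timestamps.
    """
    if not pages:
        return None
    pivot = rng.uniform(timestamps[0], timestamps[-1])
    offset = rng.randint(0, max_offset)
    if rng.random() < 0.5:
        # Like "cl_timestamp >= pivot ORDER BY cl_timestamp ASC".
        index = bisect.bisect_left(timestamps, pivot) + offset
    else:
        # Like "cl_timestamp <= pivot ORDER BY cl_timestamp DESC".
        index = bisect.bisect_right(timestamps, pivot) - 1 - offset
    # Assumption: wrap around at either end (Python's % handles negatives).
    return pages[index % len(pages)]
```

Scanning backward half of the time means entries just before a long quiet
period get the same treatment as entries just after it.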



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-04-16 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #35 from Bawolff (Brian Wolff) bawolff...@gmail.com ---
(In reply to comment #34)
> Update: The e3 team implemented a limited version of this in
> https://gerrit.wikimedia.org/r/#/c/52468/ and
> https://gerrit.wikimedia.org/r/#/c/51881/ - it only makes it available for a
> pre-configured small set of categories.
>
> Most of the code required for this is written, it just needs extracting out to
> its own Extension (RedisRandomCategory?), and perhaps an API Module. And there
> shouldn't be *too* much performance issues in getting this on cluster,
> considering that it is already deployed (albeit in a limited way) by E3.

I was reading up on Redis, and it sounds really cool. However, what I've
gathered from my brief look is that it stores all data in memory (?). I can't
imagine that would scale to all categories on enwiki (let alone all categories
everywhere).



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-04-16 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #36 from Yuvi Panda yuvipa...@gmail.com ---
(In reply to comment #35)
> I was reading up on redis, and it sounds really cool. However what ive gathered
> from my brief look is that it stores all data in memory (?) I can't imagine
> that would scale to all cats on enwiki (let alone all cats everywhere)

Ah, you're right! Though with appropriate swapping, I suppose you could use it
indefinitely (as it swaps out unused pages). But yes, Redis as is doesn't
appear to meet our requirements.



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-04-13 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

Yuvi Panda yuvipa...@gmail.com changed:

   What |Removed |Added
   CC   |        |yuvipa...@gmail.com

--- Comment #34 from Yuvi Panda yuvipa...@gmail.com ---
Update: The e3 team implemented a limited version of this in
https://gerrit.wikimedia.org/r/#/c/52468/ and
https://gerrit.wikimedia.org/r/#/c/51881/ - it only makes it available for a
pre-configured small set of categories. 

Most of the code required for this is written; it just needs extracting into
its own extension (RedisRandomCategory?), and perhaps an API module. And there
shouldn't be *too* many performance issues in getting this on the cluster,
considering that it is already deployed (albeit in a limited way) by E3.



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2013-03-25 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

Michelle Lee Kosik kosi...@mail.com changed:

   What |Removed |Added
   CC   |        |kosi...@mail.com



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2012-05-14 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

Sumana Harihareswara suma...@panix.com changed:

   What       |Removed                        |Added
   AssignedTo |wikibugs-l@lists.wikimedia.org |vasi...@gmail.com

--- Comment #33 from Sumana Harihareswara suma...@panix.com 2012-05-15 
01:55:07 UTC ---
Asher says that the easiest path to implementing this in a way that performs
suitably is the precomputed cl_random column+index solution mentioned in
comment 29 -- though it has a real cost in terms of hardware utilization. 
Assigning to Victor to see whether he would like to follow up on this.

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
You are on the CC list for the bug.

___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2012-05-03 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #32 from Asher Feldman afeld...@wikimedia.org 2012-05-03 20:58:17 
UTC ---
There's no reason why the indexer couldn't pull in categorylinks instead of
whatever it's doing now (parsing wikitext?), but we are currently short on
resources when it comes to developing around Lucene. An upgraded search
infrastructure with real-time indexing and greater accessibility around index
definitions could open the door to all sorts of features that aren't currently
practical in MediaWiki at Wikipedia scale.

(In reply to comment #30)
> > 1) Send lucene an incategory query with a limit of 1 to cheaply get the total
> > number of articles indexed in the given category. Stuff this in memcache with a
> > reasonable ttl (couple hours?) and try to grab there next time so lucene is
> > only called once.
>
> I was under the impression that lucence's incategory only worked for categories
> directly listed on a page (aka not inherited from a template). That would be a
> major negative point for using a lucence based solution (unless that issue
> could be fixed)



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2012-04-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #30 from Bawolff bawolff...@gmail.com 2012-04-30 17:42:30 UTC ---
> 1) Send lucene an incategory query with a limit of 1 to cheaply get the total
> number of articles indexed in the given category. Stuff this in memcache with a
> reasonable ttl (couple hours?) and try to grab there next time so lucene is
> only called once.

I was under the impression that Lucene's incategory only worked for categories
directly listed on a page (i.e. not inherited from a template). That would be a
major negative point for using a Lucene-based solution (unless that issue
could be fixed).



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2012-04-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #31 from MZMcBride b...@mzmcbride.com 2012-04-30 23:18:34 UTC ---
(In reply to comment #30)
> > 1) Send lucene an incategory query with a limit of 1 to cheaply get the total
> > number of articles indexed in the given category. Stuff this in memcache with a
> > reasonable ttl (couple hours?) and try to grab there next time so lucene is
> > only called once.
>
> I was under the impression that lucence's incategory only worked for categories
> directly listed on a page (aka not inherited from a template). That would be a
> major negative point for using a lucence based solution (unless that issue
> could be fixed)

True, but tangential. The relevant bug is bug 18861.



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2012-04-27 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #27 from Asher Feldman afeld...@wikimedia.org 2012-04-27 06:47:47 
UTC ---
Our current build of lsearchd won't go deeper than an offset of 10
(SearchEngine.java:protected static int maxoffset = 10;) so for categories
like Living People, we wouldn't be able to provide random results over the full
set, just the first 100k as they appear in the index, which appears to be
ordered on create time.

Actually getting the 100kth result (upper latency bound) takes ~280ms

asher@bast1001:~/srchtest$ curl
'http://search1001:8123/search/enwiki/incategory:%22Living%20people%22?limit=1&offset=9&searchall=0'
567274
#info search=[search1001,search1001], highlight=[search1005] in 283 ms
#no suggestion
#interwiki 0 0
#results 1
1.4743276 0 Boris_Boillon

If you ditch the join and take the same approach with mysql, it's several times
faster than lucene:

mysql> select cl_from from categorylinks where cl_to='Living_people' limit 1
offset 9;
+----------+
| cl_from  |
+----------+
| 13546433 |
+----------+
1 row in set (0.06 sec)

The worst case for Living_people isn't great (~350ms), but still faster than
lucene would be if we upped lsearchd's max offset:

mysql> select cl_from from categorylinks where cl_to='Living_people' limit 1
offset 56;
+----------+
| cl_from  |
+----------+
| 27345638 |
+----------+
1 row in set (0.35 sec)



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2012-04-27 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #28 from Domas Mituzas domas.mitu...@gmail.com 2012-04-27 
14:13:42 UTC ---
We need more features that scan full datasets to return single row.



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2012-04-27 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #29 from Asher Feldman afeld...@wikimedia.org 2012-04-27 16:43:17 
UTC ---
Domas is of course right. Adding a precomputed cl_random column+index is needed
to make this feature acceptable via mysql. Doing so incurs a permanent cost.
Alternatively, we could add the existing page_random field to the lucene index
and make it searchable to eliminate offset scanning there. The latter may be
cheaper.
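
To illustrate what a precomputed cl_random column plus index would buy, here
is a small sketch using Python's built-in sqlite3 module instead of MySQL.
The table is reduced to three columns, the index name cl_to_random is made up,
and the wrap-around query is an assumption about how such a column would
typically be used; none of it is taken from an actual schema change.

```python
import random
import sqlite3


def build_demo_db(members, rng):
    """Create an in-memory categorylinks table with a random key per row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT, cl_random REAL)"
    )
    # Hypothetical index name; it lets the lookup below avoid offset scanning.
    conn.execute("CREATE INDEX cl_to_random ON categorylinks (cl_to, cl_random)")
    conn.executemany(
        "INSERT INTO categorylinks VALUES (?, ?, ?)",
        [(page_id, category, rng.random()) for page_id, category in members],
    )
    return conn


def random_member(conn, category, rng):
    """Return the cl_from of the first row at or after a random key."""
    row = conn.execute(
        "SELECT cl_from FROM categorylinks "
        "WHERE cl_to = ? AND cl_random >= ? ORDER BY cl_random LIMIT 1",
        (category, rng.random()),
    ).fetchone()
    if row is None:
        # Assumption: wrap around to the smallest key in the category.
        row = conn.execute(
            "SELECT cl_from FROM categorylinks "
            "WHERE cl_to = ? ORDER BY cl_random LIMIT 1",
            (category,),
        ).fetchone()
    return row[0] if row else None
```

Each lookup is a single index seek regardless of category size; the permanent
cost mentioned above is the extra column and index on every categorylinks row.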



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2012-04-26 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

Asher Feldman afeld...@wikimedia.org changed:

   What |Removed |Added
   CC   |        |afeld...@wikimedia.org

--- Comment #25 from Asher Feldman afeld...@wikimedia.org 2012-04-27 00:05:47 
UTC ---
(In reply to comment #23)
> (In reply to comment #22)
> > (In reply to comment #21)
> > > binasher vvv: RoanKattouw: if the query is like what roan mentioned above,
> > > -1.  this sort of thing should be done with a search engine and is probably
> > > even doable with the one we have
> >
> > What do you mean? Which search engines return random pages?
>
> Search engines have pregenerated document lists stored in an efficient format
> for various criteria. Usually the presence or absence of a given keyword is the
> criterion of interest, but membership in a category can be handled in the same
> way. Since the list is pregenerated, the length is known, so you can choose a
> random offset into the category and perhaps even skip to that offset
> efficiently. Asher probably means that if Lucene doesn't have such a feature
> already, it could be patched in.

Indeed, the method Tim outlines would let you grab a random result from any
search engine that supports pagination.  

You can also get randomized output directly from a search engine given control
over sorting, which would normally be in descending order on an IR score. Solr
has a random result module and it's implementable in Lucene, including version
2 which we run in production.

See the section "Bonus! For those of you trapped in Lucene 2" at the bottom of:
http://stackoverflow.com/questions/7201638/lucene-2-9-2-how-to-show-results-in-random-order



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2012-04-26 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

--- Comment #26 from Asher Feldman afeld...@wikimedia.org 2012-04-27 01:25:46 
UTC ---
How to implement a category based random feature in wikipedia without touching
the database:

1) Send lucene an incategory query with a limit of 1 to cheaply get the total
number of articles indexed in the given category. Stuff this in memcache with a
reasonable ttl (couple hours?) and try to grab there next time so lucene is
only called once.

2) Send the same category with an offset of rand(0, $doc_count - 1)

3) Redirect to the article returned in step 2.

Command line example to get a random Domesticated animals article.

Step one - the very first item in the response is the match count (36 in this
case): 

asher@bast1001:~/srchtest$ curl
'http://search1001:8123/search/enwiki/incategory:%22Domesticated%20animals%22?limit=1'
36
#info search=[search1001,search1001], highlight=[search1004] in 4 ms
#no suggestion
#interwiki 0 0
#results 1
12.586081 0 Genomics_of_domestication
#h.text [] [] [+] date+November+2011
#h.text [] [] []
Genomics+is+the+study+of+the+structure%2C+content%2C+and+evolution++of+genomes+%2C+or+the+entire+genetic+information+of+
#h.date 2012-04-04T12:25:13Z
#h.wordcount 2252
#h.size 15955

Step two - pick a random number between 0 and 35. Let's go with 16.

asher@bast1001:~/srchtest$ curl
'http://search1001:8123/search/enwiki/incategory:%22Domesticated%20animals%22?offset=16&limit=1'
36
#info search=[search1001,search1001], highlight=[search1005] in 3 ms
#no suggestion
#interwiki 0 0
#results 1
6.2930403 0 Fancy_pigeon
#h.text [] [] [+]
Fancy+pigeons+are+domesticated++varieties+of+the+Rock+Pigeon++%28Columba+livia%29.+
#h.text [] [] [] They+are+bred+by+pigeon+fanciers++for+various+traits+
#h.date 2012-02-21T12:23:54Z
#h.wordcount 903
#h.size 7354

Hi, Fancy Pigeon!
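
The three steps can also be sketched in Python. This is only an illustration
under assumptions: search stands for any callable that returns the total match
count and a list of titles (a stand-in for the lsearchd request shown above),
and CountCache is a tiny in-process substitute for memcache. Neither name
exists in MediaWiki.

```python
import random
import time


class CountCache:
    """Minimal in-process stand-in for memcache with a per-key TTL."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None or entry[1] < time.monotonic():
            return None
        return entry[0]

    def set(self, key, value, ttl):
        self._items[key] = (value, time.monotonic() + ttl)


def random_article(category, search, cache, rng=random, ttl=7200):
    """Return a random title from category, or None if it is empty.

    search(category, offset, limit) must return (total_count, titles).
    """
    # Step 1: get the match count, from the cache when possible.
    count = cache.get(category)
    if count is None:
        count, _ = search(category, offset=0, limit=1)
        cache.set(category, count, ttl)
    if count == 0:
        return None
    # Step 2: ask for one result at a random offset.
    offset = rng.randint(0, count - 1)
    _, titles = search(category, offset=offset, limit=1)
    # Step 3: the caller would redirect to this title.
    return titles[0] if titles else None
```

While the cached count is fresh, each request costs one search call. The price
is that the count can be stale for up to the TTL, so pages added since the
count was cached may be unreachable until it expires.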



[Bug 25931] Implement efficient way to select random page from specified category on Wikimedia wikis

2012-04-25 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=25931

MZMcBride b...@mzmcbride.com changed:

   What    |Removed                                                      |Added
   Summary |Implement way to select random page from specified category |Implement efficient way to select random page from specified category on Wikimedia wikis

--- Comment #24 from MZMcBride b...@mzmcbride.com 2012-04-26 04:41:53 UTC ---
For those wondering, this bug is not a duplicate of bug 2170. Bug 2170 is about
having the feature generally available in MediaWiki (which was implemented as
the RandomInCategory extension). This bug is about having the feature
available on exceptional MediaWiki installations, namely those that run
Wikimedia wikis.
