Brian Wolff has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/71997


Change subject: Add Special:RandomInCategory.
......................................................................

Add Special:RandomInCategory.

This is meant mostly to spur discussion. The method used is quite
biased, but I believe it's the best that can be done efficiently
without a schema change.

The question at hand: is this method acceptable? The method used
is to choose a random timestamp and look at cl_timestamp. This method
will give good results if the timestamps are uniformly distributed
(which is probably not usually true). I think it may give acceptable
results in general, especially given that most people are not interested
in true randomness, but more in "give me a result I haven't seen before"
(for example, to pick a random entry in a maintenance category to clean
up).

It also fudges the result a little bit, using a small random offset
(0 to 30 rows), to stop really biased results from happening.
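
To make the bias easier to discuss, here is a rough simulation of the
selection outside MediaWiki. It is only a sketch, not the code in this
patch: `pick_index`, `most_common_share` and the synthetic timestamps are
made-up names and data, and it assumes the lookup takes the first entry at
or after the random timestamp, then skips `offset` rows, with the same
fallback order as described above.

```python
import bisect
import random
from collections import Counter


def pick_index(timestamps, max_offset=30, rng=random):
    """Return the index chosen from a sorted timestamp list, or None if empty."""
    if not timestamps:
        return None
    low, high = timestamps[0], timestamps[-1]
    # Random point in the category's timestamp range.
    cutoff = low + (high - low) * rng.random()
    # Small random number of rows to skip (the "fudge").
    offset = rng.randint(0, max_offset)
    # Assumption: first entry whose timestamp is >= cutoff.
    start = bisect.bisect_left(timestamps, cutoff)
    # Fallbacks: wrap to the start, then drop the offset, then first entry.
    for s, o in ((start, offset), (0, offset), (start, 0), (0, 0)):
        if s + o < len(timestamps):
            return s + o
    return None


def most_common_share(timestamps, max_offset, draws=2000, seed=1):
    """Return (index, share) of the most frequently selected entry."""
    rng = random.Random(seed)
    counts = Counter(
        pick_index(timestamps, max_offset, rng) for _ in range(draws)
    )
    index, hits = counts.most_common(1)[0]
    return index, hits / draws


if __name__ == "__main__":
    # Hypothetical category: one very old entry, then 1000 bulk-added ones.
    skewed = [0] + [1_000_000 + i for i in range(1000)]
    print("no offset:  ", most_common_share(skewed, max_offset=0))
    print("offset 0-30:", most_common_share(skewed, max_offset=30))
```

In this model, with the offset disabled, nearly every draw lands on the
entry right after the large gap. With the 0 to 30 offset the draws spread
over the entries that follow it, which is less repetitive but still far
from uniform.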

Bug: 25931
Change-Id: I0c48e4a236b50fb627af94f0df47fef8372ea14d
---
M RELEASE-NOTES-1.22
M includes/AutoLoader.php
M includes/SpecialPageFactory.php
A includes/specials/SpecialRandomInCategory.php
M languages/messages/MessagesEn.php
M languages/messages/MessagesQqq.php
M maintenance/language/messages.inc
7 files changed, 291 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/core refs/changes/97/71997/1

diff --git a/RELEASE-NOTES-1.22 b/RELEASE-NOTES-1.22
index 45a6f01..0d06ff2 100644
--- a/RELEASE-NOTES-1.22
+++ b/RELEASE-NOTES-1.22
@@ -137,6 +137,7 @@
   also granting the ability to protect and unprotect.
 * (bug 48256) Make brackets in section edit links accessible to CSS.
   They are now wrapped in <span class="mw-editsection-bracket" />.
+* (bug 25931) Add Special:RandomInCategory.
 
 === Bug fixes in 1.22 ===
 * Disable Special:PasswordReset when $wgEnableEmail is false. Previously one
diff --git a/includes/AutoLoader.php b/includes/AutoLoader.php
index 6f8cd4b..09d5f90 100644
--- a/includes/AutoLoader.php
+++ b/includes/AutoLoader.php
@@ -932,6 +932,7 @@
        'ProtectedPagesPager' => 'includes/specials/SpecialProtectedpages.php',
        'ProtectedTitlesPager' => 'includes/specials/SpecialProtectedtitles.php',
        'RandomPage' => 'includes/specials/SpecialRandompage.php',
+       'RandomInCategory' => 'includes/specials/SpecialRandomInCategory.php',
        'ShortPagesPage' => 'includes/specials/SpecialShortpages.php',
        'SpecialActiveUsers' => 'includes/specials/SpecialActiveusers.php',
        'SpecialAllmessages' => 'includes/specials/SpecialAllmessages.php',
diff --git a/includes/SpecialPageFactory.php b/includes/SpecialPageFactory.php
index 4d63553..c4b3dd0 100644
--- a/includes/SpecialPageFactory.php
+++ b/includes/SpecialPageFactory.php
@@ -130,6 +130,7 @@
                // Redirecting special pages
                'LinkSearch'                => 'LinkSearchPage',
                'Randompage'                => 'Randompage',
+               'RandomInCategory'          => 'RandomInCategory',
                'Randomredirect'            => 'SpecialRandomredirect',
 
                // High use pages
diff --git a/includes/specials/SpecialRandomInCategory.php b/includes/specials/SpecialRandomInCategory.php
new file mode 100644
index 0000000..3b2f964
--- /dev/null
+++ b/includes/specials/SpecialRandomInCategory.php
@@ -0,0 +1,264 @@
+<?php
+/**
+ * Implements Special:RandomInCategory
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ * http://www.gnu.org/copyleft/gpl.html
+ *
+ * @file
+ * @ingroup SpecialPage
+ * @author Brian Wolff
+ */
+
+/**
+ * Special page to direct the user to a random page in a given category
+ *
+ * @note The method used here is rather biased. It is assumed that
+ * the use of this page will be people wanting to get a random page
+ * out of a maintenance category, to fix it up. The method used by
+ * this page should return different pages in an unpredictable fashion
+ * which is hoped to be sufficient, even if some pages are selected
+ * more often than others.
+ *
+ * A more unbiased method could be achieved by adding a cl_random field
+ * to the categorylinks table.
+ *
+ * The method used here is as follows:
+ *  * Find the smallest and largest timestamp in the category
+ *  * Pick a random timestamp in between
+ *  * Pick an offset between 0 and 30
+ *  * Get the offset'ed page that is newer than the timestamp selected
+ * The offset is meant to counter the fact the timestamps aren't usually
+ * uniformly distributed, so if things are very non-uniform at least we
+ * won't have the same page selected 99% of the time.
+ *
+ * @ingroup SpecialPage
+ */
+class RandomInCategory extends SpecialPage {
+       protected $extra = array(); // Extra SQL statements
+       protected $category = false; // Title object of category
+       protected $maxOffset = 30; // Max amount to fudge randomness by.
+       private $maxTimestamp = null;
+       private $minTimestamp = null;
+
+       public function __construct( $name = 'RandomInCategory' ) {
+               parent::__construct( $name );
+       }
+
+       /**
+        * Set which category to use.
+        * @param Title $cat
+        */
+       public function setCategory( Title $cat ) {
+               $this->category = $cat;
+               $this->maxTimestamp = null;
+               $this->minTimestamp = null;
+       }
+
+       public function execute( $par ) {
+               $cat = false;
+
+               $categoryStr = $this->getRequest()->getText( 'category', $par );
+
+               if ( $categoryStr ) {
+                       $cat = Title::newFromText( $categoryStr, NS_CATEGORY );
+               }
+
+               if ( $cat ) {
+                       $this->setCategory( $cat );
+               }
+
+
+               if ( !$this->category && $categoryStr ) {
+                       $this->setHeaders();
+                       // For grep: uses message randomincategory-invalidcategory
+                       $this->getOutput()->addWikiMsg( strtolower( $this->getName() ) . '-invalidcategory',
+                               wfEscapeWikiText( $categoryStr ) );
+
+                       return;
+               } elseif ( !$this->category ) {
+                       $this->setHeaders();
+                       // For grep: uses message randomincategory-selectcategory, randomincategory-selectcategory-submit.
+                       $input = Html::input( 'category' );
+                       $submitText = wfMessage( strtolower( $this->getName() ) . '-selectcategory-submit' )->text();
+                       $submit = Html::input( '', $submitText, 'submit' );
+
+                       $msg = $this->msg( strtolower( $this->getName() ) . '-selectcategory' );
+                       $form = Html::rawElement( 'form', array( 'action' => $this->getTitle()->getLocalUrl() ),
+                               $msg->rawParams( $input, $submit )->parse()
+                       );
+                       $this->getOutput()->addHtml( $form );
+
+                       return;
+               }
+
+               $title = $this->getRandomTitle();
+
+               if ( is_null( $title ) ) {
+                       $this->setHeaders();
+                       // For grep: Uses message randomincategory-nopages
+                       $this->getOutput()->addWikiMsg( strtolower( $this->getName() ) . '-nopages',
+                               $this->category->getText() );
+
+                       return;
+               }
+
+               $query = $this->getRequest()->getValues();
+               unset( $query['title'] );
+               unset( $query['category'] );
+               $this->getOutput()->redirect( $title->getFullURL( $query ) );
+       }
+
+       /**
+        * Choose a random title.
+        * @return Title object (or null if nothing to choose from)
+        */
+       public function getRandomTitle() {
+               // Convert to float, since we do math with the random number.
+               $rand = floatval( wfRandom() );
+               $title = null;
+
+               // Given that timestamps are rather unevenly distributed, we also
+               // use an offset between 0 and 30 to make any biases less noticeable.
+               $offset = mt_rand( 0, $this->maxOffset );
+
+               $row = $this->selectRandomPageFromDB( $rand, $offset );
+
+               // Try again without the timestamp offset (wrap around the end)
+               if ( !$row ) {
+                       $row = $this->selectRandomPageFromDB( 0, $offset );
+               }
+
+               // Maybe the category is really small and offset too high
+               if ( !$row ) {
+                       $row = $this->selectRandomPageFromDB( $rand, 0 );
+               }
+
+               // Just get the first entry.
+               if ( !$row ) {
+                       $row = $this->selectRandomPageFromDB( 0, 0 );
+               }
+
+               if ( $row ) {
+                       return Title::makeTitleSafe( $row->page_namespace, $row->page_title );
+               }
+
+               return null;
+       }
+
+       protected function getQueryInfo( $rand, $offset ) {
+               if ( !$this->category instanceof Title ) {
+                       throw new MWException( 'No category set' );
+               }
+               $qi = array(
+                       'tables' => array( 'categorylinks', 'page' ),
+                       'fields' => array( 'page_title', 'page_namespace' ),
+                       'conds' => array_merge( array(
+                               'cl_to' => $this->category->getDBKey(),
+                       ), $this->extra ),
+                       'options' => array(
+                               'ORDER BY' => 'cl_timestamp',
+                               'LIMIT' => 1,
+                               'OFFSET' => $offset
+                       ),
+                       'join_conds' => array(
+                               'page' => array( 'INNER JOIN', 'cl_from = page_id' )
+                       )
+               );
+               $minClTime = $this->getTimestampOffset( $rand );
+               if ( $minClTime ) {
+                       $dbr = wfGetDB( DB_SLAVE );
+                       $qi['conds'][] = 'cl_timestamp >= ' . $dbr->addQuotes( $minClTime );
+               }
+               return $qi;
+       }
+
+       /**
+        * @param float $rand Random number between 0 and 1
+        *
+        * @return string|bool A random timestamp from the range of the category or false on failure
+        */
+       protected function getTimestampOffset( $rand ) {
+               if ( !$this->minTimestamp || !$this->maxTimestamp ) {
+                       try {
+                               list( $this->minTimestamp, $this->maxTimestamp ) = $this->getMinAndMaxForCat( $this->category );
+                       } catch( MWException $e ) {
+                               // Possibly no entries in category.
+                               return false;
+                       }
+               }
+
+               $ts = ( $this->maxTimestamp - $this->minTimestamp ) * $rand + $this->minTimestamp;
+
+               // XXX: The cl_timestamp field is weird in mysql schema and uses a different format
+               // than what $dbr->timestamp() yields
+               return wfTimestamp( TS_DB, intval( $ts ) );
+       }
+
+       /**
+        * Get the lowest and highest timestamp for a category.
+        *
+        * @param Title $category
+        * @return Array The lowest and highest timestamp
+        * @throws MWException if category has no entries.
+        */
+       protected function getMinAndMaxForCat( Title $category ) {
+               $dbr = wfGetDB( DB_SLAVE );
+               $res = $dbr->selectRow(
+                       'categorylinks',
+                       array(
+                               'MIN( cl_timestamp ) AS low',
+                               'MAX( cl_timestamp ) AS high'
+                       ),
+                       array(
+                               'cl_to' => $category->getDBKey(),
+                       ),
+                       __METHOD__,
+                       array(
+                               'LIMIT' => 1
+                       )
+               );
+               if ( !$res || $res->low === null ) {
+                       throw new MWException( 'No entries in category' );
+               }
+               return array( wfTimestamp( TS_UNIX, $res->low ), wfTimestamp( TS_UNIX, $res->high ) );
+       }
+
+       /**
+        * @param float $rand A random number that is converted to a random timestamp
+        * @param int $offset A small offset to make the result seem more "random"
+        * @param String $fname The name of the calling method
+        * @return Array Info for the title selected.
+        */
+       private function selectRandomPageFromDB( $rand, $offset, $fname = __METHOD__ ) {
+               $dbr = wfGetDB( DB_SLAVE );
+
+               $query = $this->getQueryInfo( $rand, $offset );
+               $res = $dbr->select(
+                       $query['tables'],
+                       $query['fields'],
+                       $query['conds'],
+                       $fname,
+                       $query['options'],
+                       $query['join_conds']
+               );
+
+               return $dbr->fetchObject( $res );
+       }
+
+       protected function getGroupName() {
+               return 'redirects';
+       }
+}
diff --git a/languages/messages/MessagesEn.php b/languages/messages/MessagesEn.php
index d675e17..5c51058 100644
--- a/languages/messages/MessagesEn.php
+++ b/languages/messages/MessagesEn.php
@@ -446,6 +446,7 @@
        'Protectedpages'            => array( 'ProtectedPages' ),
        'Protectedtitles'           => array( 'ProtectedTitles' ),
        'Randompage'                => array( 'Random', 'RandomPage' ),
+       'RandomInCategory'          => array( 'RandomInCategory' ),
        'Randomredirect'            => array( 'RandomRedirect' ),
        'Recentchanges'             => array( 'RecentChanges' ),
        'Recentchangeslinked'       => array( 'RecentChangesLinked', 'RelatedChanges' ),
@@ -2602,6 +2603,15 @@
 'randompage-nopages' => 'There are no pages in the following {{PLURAL:$2|namespace|namespaces}}: $1.',
 'randompage-url'     => 'Special:Random', # do not translate or duplicate this message to other languages
 
+# Random page in category
+'randomincategory'                  => 'Random page in category',
+'randomincategory-invalidcategory'  => '"$1" is not a valid category name.',
+'randomincategory-nopages'          => 'There are no pages in [[:Category:$1]].',
+'randomincategory-selectcategory'   => 'Get random page from category $1 $2.
+
+<small>The selection process of this page is biased and should not be used for statistical purposes.</small>',
+'randomincategory-selectcategory-submit' => 'Go',
+
 # Random redirect
 'randomredirect'         => 'Random redirect',
 'randomredirect-nopages' => 'There are no redirects in the namespace "$1".',
diff --git a/languages/messages/MessagesQqq.php b/languages/messages/MessagesQqq.php
index 616ddcd..3daf239 100644
--- a/languages/messages/MessagesQqq.php
+++ b/languages/messages/MessagesQqq.php
@@ -4041,6 +4041,12 @@
 'randompage-nopages' => '* $1 - list of namespaces
 * $2 - number of namespaces',
 
+'randomincategory'                  => '{{doc-special|RandomInCategory}}',
+'randomincategory-invalidcategory'  => 'Message shown if an invalid category is specified. (Note: if the category is simply empty, but could possibly exist, {{msg-mw|randomincategory-nopages}} is shown instead.) $1 is the invalid category name given.',
+'randomincategory-nopages'          => 'Message shown from Special:RandomInCategory if the category is empty. $1 is the category name (without the namespace prefix).',
+'randomincategory-selectcategory'   => 'Shown on Special:RandomInCategory if no category is selected. Displays a form allowing the user to input a category name. $1 is the text field input box, $2 is the go button. The text content of the button comes from {{msg-mw|randomincategory-selectcategory-submit}}.',
+'randomincategory-selectcategory-submit' => 'Text of button used in {{msg-mw|randomincategory-selectcategory}}.',
+
 # Random redirect
 'randomredirect' => '{{doc-special|RandomRedirect}}',
 'randomredirect-nopages' => '* $1 - namespace name',
diff --git a/maintenance/language/messages.inc b/maintenance/language/messages.inc
index 2cb94cf..5ac1b63 100644
--- a/maintenance/language/messages.inc
+++ b/maintenance/language/messages.inc
@@ -1664,6 +1664,13 @@
                'randompage-nopages',
                'randompage-url',
        ),
+       'randomincategory' => array(
+               'randomincategory',
+               'randomincategory-invalidcategory',
+               'randomincategory-nopages',
+               'randomincategory-selectcategory',
+               'randomincategory-selectcategory-submit',
+       ),
        'randomredirect' => array(
                'randomredirect',
                'randomredirect-nopages',
@@ -4005,6 +4012,7 @@
        'listredirects'       => 'List redirects',
        'unusedtemplates'     => 'Unused templates',
        'randompage'          => 'Random page',
+       'randomincategory'    => 'Special:RandomInCategory',
        'randomredirect'      => 'Random redirect',
        'statistics'          => 'Statistics',
        'disambiguations'     => '',

-- 
To view, visit https://gerrit.wikimedia.org/r/71997
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0c48e4a236b50fb627af94f0df47fef8372ea14d
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/core
Gerrit-Branch: master
Gerrit-Owner: Brian Wolff <bawolff...@gmail.com>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
