jenkins-bot has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/401776 )

Change subject: Hygiene: Begin refactoring of MobileFormatter mega class
......................................................................


Hygiene: Begin refactoring of MobileFormatter mega class

After a discussion with Piotr we agreed that the MobileFormatter
code is becoming unmanageable and hard to test.

Going forward we will aim to reduce all transforms to simple classes
that take a DomElement and transform it into another DomElement.
This will improve testability and sanity.

As a first pass we will only update the lead paragraph transform
given that we are currently working on improving that feature in
T170006

To prove this is not specific to the lead paragraph we add the
AddMobileTocTransform to demonstrate this is extensible

Additional change:
* Move logic outside MobileFormatter to before creation of class
* Remove superfulous test - the lead paragraph transform is never
invoked outside main namespace so it's impossible to log things in
the main namespace. This test thus serves no purpose.

Bug: T170006
Change-Id: I1fb2dc9e7ad1fe0431819ce3ca6295b3df8410e0
---
M extension.json
M includes/MobileFormatter.php
M includes/MobileFrontend.body.php
A includes/transforms/AddMobileTocTransform.php
A includes/transforms/IMobileTransform.php
A includes/transforms/MoveLeadParagraphTransform.php
A includes/transforms/NoTransform.php
M tests/phpunit/MobileFormatterTest.php
A tests/phpunit/transforms/MoveLeadParagraphTransformTest.php
9 files changed, 219 insertions(+), 127 deletions(-)

Approvals:
  Pmiazga: Looks good to me, approved
  jenkins-bot: Verified



diff --git a/extension.json b/extension.json
index d650d4d..351d06e 100644
--- a/extension.json
+++ b/extension.json
@@ -45,6 +45,9 @@
        "ExtensionMessagesFiles": {
                "MobileFrontendAlias": "MobileFrontend.alias.php"
        },
+       "AutoloadNamespaces": {
+               "MobileFrontend\\Transforms\\": "includes/transforms/"
+       },
        "AutoloadClasses": {
                "ExtMobileFrontend": "includes/MobileFrontend.body.php",
                "MobileFrontendHooks": "includes/MobileFrontend.hooks.php",
diff --git a/includes/MobileFormatter.php b/includes/MobileFormatter.php
index cfabb07..e64249c 100644
--- a/includes/MobileFormatter.php
+++ b/includes/MobileFormatter.php
@@ -2,6 +2,9 @@
 
 use HtmlFormatter\HtmlFormatter;
 use MobileFrontend\ContentProviders\IContentProvider;
+use MobileFrontend\Transforms\MoveLeadParagraphTransform;
+use MobileFrontend\Transforms\AddMobileTocTransform;
+use MobileFrontend\Transforms\NoTransform;
 
 /**
  * Converts HTML into a mobile-friendly version
@@ -218,94 +221,6 @@
                }
        }
 
-       /**
-        * Move the first paragraph in the lead section above the infobox
-        *
-        * In order for a paragraph to be moved the following conditions must 
be met:
-        *   - the lead section contains at least one infobox;
-        *   - the paragraph doesn't already appear before the first infobox
-        *     if any in the DOM;
-        *   - the paragraph contains text content, e.g. no <p></p>;
-        *   - the paragraph doesn't contain coordinates, i.e. span#coordinates.
-        *   - article belongs to the MAIN namespace
-        *
-        * Additionally if paragraph immediate sibling is a list (ol or ul 
element), the list
-        * is also moved along with paragraph above infobox.
-        *
-        * Note that the first paragraph is not moved before hatnotes, or mbox 
or other
-        * elements that are not infoboxes.
-        *
-        * @param DOMElement $leadSectionBody
-        * @param DOMDocument $doc Document to which the section belongs
-        */
-       private function moveFirstParagraphBeforeInfobox( $leadSectionBody, 
$doc ) {
-               // Move lead parapgraph only on pages in MAIN namespace (see 
@T163805)
-               if ( $this->title->getNamespace() !== NS_MAIN ) {
-                       return;
-               }
-               $xPath = new DOMXPath( $doc );
-               // Find infoboxes and paragraphs that have text content, i.e. 
paragraphs
-               // that are not empty nor are wrapper paragraphs that contain 
span#coordinates.
-               $infoboxAndParagraphs = $xPath->query(
-                       './table[contains(@class,"infobox")] | 
./p[string-length(text()) > 0]',
-                       $leadSectionBody
-               );
-               // We need both an infobox and a paragraph and the first 
element of our query result
-               // ought to be an infobox.
-               if ( $infoboxAndParagraphs->length >= 2 &&
-                       $infoboxAndParagraphs->item( 0 )->nodeName == 'table'
-               ) {
-                       $firstP = null;
-                       for ( $i = 1; $i < $infoboxAndParagraphs->length; $i++ 
) {
-                               if ( $infoboxAndParagraphs->item( $i 
)->nodeName == 'p' ) {
-                                       $firstP = $infoboxAndParagraphs->item( 
$i );
-                                       break;
-                               }
-                       }
-                       if ( $firstP ) {
-                               $listElementAfterParagraph = null;
-                               $where = $infoboxAndParagraphs->item( 0 );
-
-                               $elementAfterParagraphQuery = $xPath->query( 
'following-sibling::*[1]', $firstP );
-                               if ( $elementAfterParagraphQuery->length > 0 ) {
-                                       $elem = 
$elementAfterParagraphQuery->item( 0 );
-                                       if ( $elem->tagName === 'ol' || 
$elem->tagName === 'ul' ) {
-                                               $listElementAfterParagraph = 
$elem;
-                                       }
-                               }
-
-                               $leadSectionBody->insertBefore( $firstP, $where 
);
-                               if ( $listElementAfterParagraph !== null ) {
-                                       $leadSectionBody->insertBefore( 
$listElementAfterParagraph, $where );
-                               }
-                       }
-               }
-               /**
-                * @see https://phabricator.wikimedia.org/T149884
-                * @todo remove after research is done
-                */
-               if ( MobileContext::singleton()->getMFConfig()->get( 
'MFLogWrappedInfoboxes' ) ) {
-                       $this->logInfoboxesWrappedInContainers( 
$leadSectionBody, $xPath );
-               }
-       }
-
-       /**
-        * Finds all infoboxes which are one or more levels deep in $xPath 
content. When at least one
-        * element is found - log the page title and revision
-        *
-        * @see https://phabricator.wikimedia.org/T149884
-        * @param $leadSectionBody
-        * @param DOMXPath $xPath
-        */
-       private function logInfoboxesWrappedInContainers( $leadSectionBody, 
DOMXPath $xPath ) {
-               $infoboxes = $xPath->query( 
'./*//table[contains(@class,"infobox")]' .
-                       '[not(ancestor::table[contains(@class,"infobox")])]', 
$leadSectionBody );
-               if ( $infoboxes->length > 0 ) {
-                       \MediaWiki\Logger\LoggerFactory::getInstance( 'mobile' 
)->info(
-                               "Found infobox wrapped with container on 
{$this->title} (rev:{$this->revId})"
-                       );
-               }
-       }
        /**
         * Replaces any references links with a link to Special:MobileCite
         *
@@ -646,6 +561,11 @@
         * @param array $transformOptions Options to pass when transforming 
content per section
         */
        protected function makeSections( DOMDocument $doc, array $headings, 
array $transformOptions ) {
+               $noTransform = new NoTransform();
+               $tocTransform = $this->isTOCEnabled ? new 
AddMobileTocTransform() : $noTransform;
+               $leadTransform = $transformOptions[ 
self::SHOW_FIRST_PARAGRAPH_BEFORE_INFOBOX ] ?
+                               new MoveLeadParagraphTransform( $this->title, 
$this->revId ) : $noTransform;
+
                // Find the parser output wrapper div
                $xpath = new DOMXPath( $doc );
                $containers = $xpath->query( 
'body/div[@class="mw-parser-output"][1]' );
@@ -686,18 +606,8 @@
                                $container->insertBefore( $sectionBody, $node );
 
                                if ( $sectionNumber === 0 ) {
-                                       if ( $this->isTOCEnabled ) {
-                                               // Insert table of content 
placeholder which will be progressively enhanced via JS
-                                               $toc = $doc->createElement( 
'div' );
-                                               $toc->setAttribute( 'id', 'toc' 
);
-                                               $toc->setAttribute( 'class', 
'toc-mobile' );
-                                               $tocHeading = 
$doc->createElement( 'h2', wfMessage( 'toc' )->text() );
-                                               $toc->appendChild( $tocHeading 
);
-                                               $sectionBody->appendChild( $toc 
);
-                                       }
-                                       if ( $transformOptions[ 
self::SHOW_FIRST_PARAGRAPH_BEFORE_INFOBOX ] ) {
-                                               
$this->moveFirstParagraphBeforeInfobox( $sectionBody, $doc );
-                                       }
+                                       $tocTransform->apply( $sectionBody );
+                                       $leadTransform->apply( $sectionBody );
                                }
                                $sectionNumber += 1;
                                $sectionBody = $this->createSectionBodyElement( 
$doc, $sectionNumber, $this->scriptsEnabled );
@@ -710,8 +620,8 @@
                }
 
                // If the document had the lead section only:
-               if ( $sectionNumber == 0 && $transformOptions[ 
self::SHOW_FIRST_PARAGRAPH_BEFORE_INFOBOX ] ) {
-                       $this->moveFirstParagraphBeforeInfobox( $sectionBody, 
$doc );
+               if ( $sectionNumber == 0 ) {
+                       $leadTransform->apply( $sectionBody );
                }
 
                if ( $sectionBody->hasChildNodes() ) {
diff --git a/includes/MobileFrontend.body.php b/includes/MobileFrontend.body.php
index 3d2688b..45f4196 100644
--- a/includes/MobileFrontend.body.php
+++ b/includes/MobileFrontend.body.php
@@ -63,7 +63,8 @@
 
                $removeImages = $context->isLazyLoadImagesEnabled();
                $removeReferences = $context->isLazyLoadReferencesEnabled();
-               $showFirstParagraphBeforeInfobox = 
$context->shouldShowFirstParagraphBeforeInfobox();
+               $showFirstParagraphBeforeInfobox = 
$context->shouldShowFirstParagraphBeforeInfobox()
+                       && $ns === NS_MAIN;
 
                if ( $context->getContentTransformations() ) {
                        // Remove images if they're disabled from special 
pages, but don't transform otherwise
diff --git a/includes/transforms/AddMobileTocTransform.php 
b/includes/transforms/AddMobileTocTransform.php
new file mode 100644
index 0000000..3d5bcde
--- /dev/null
+++ b/includes/transforms/AddMobileTocTransform.php
@@ -0,0 +1,23 @@
+<?php
+
+namespace MobileFrontend\Transforms;
+
+use DOMElement;
+
+class AddMobileTocTransform implements IMobileTransform {
+       /**
+        * Insert a table of content placeholder into the element
+        * which will be progressively enhanced via JS
+        * @param DOMElement $node to be transformed
+        */
+       public function apply( DOMElement $node ) {
+               $doc = $node->ownerDocument;
+               // Insert table of content placeholder which will be 
progressively enhanced via JS
+               $toc = $doc->createElement( 'div' );
+               $toc->setAttribute( 'id', 'toc' );
+               $toc->setAttribute( 'class', 'toc-mobile' );
+               $tocHeading = $doc->createElement( 'h2', wfMessage( 'toc' 
)->text() );
+               $toc->appendChild( $tocHeading );
+               $node->appendChild( $toc );
+       }
+}
diff --git a/includes/transforms/IMobileTransform.php 
b/includes/transforms/IMobileTransform.php
new file mode 100644
index 0000000..8de08bb
--- /dev/null
+++ b/includes/transforms/IMobileTransform.php
@@ -0,0 +1,13 @@
+<?php
+
+namespace MobileFrontend\Transforms;
+
+use DOMElement;
+
+interface IMobileTransform {
+       /**
+        * Transforms the DOMElement in some way
+        * @param DOMElement $node to be transformed
+        */
+       public function apply( DOMElement $node );
+}
diff --git a/includes/transforms/MoveLeadParagraphTransform.php 
b/includes/transforms/MoveLeadParagraphTransform.php
new file mode 100644
index 0000000..82f7a71
--- /dev/null
+++ b/includes/transforms/MoveLeadParagraphTransform.php
@@ -0,0 +1,112 @@
+<?php
+
+namespace MobileFrontend\Transforms;
+
+use DOMXPath;
+use MobileContext;
+use DOMElement;
+
+class MoveLeadParagraphTransform implements IMobileTransform {
+       /**
+        * @param string $title for logging purposes
+        * @param number $revId for logging purposes
+        */
+       public function __construct( $title, $revId ) {
+               $this->title = $title;
+               $this->revId = $revId;
+       }
+
+       /**
+        * Rearranges content so that text in the lead paragraph is prioritised 
to appear
+        * before the infobox
+        * @param DOMElement $node to be transformed
+        */
+       public function apply( DOMElement $node ) {
+               $this->moveFirstParagraphBeforeInfobox( $node, 
$node->ownerDocument );
+       }
+
+       /**
+        * Move the first paragraph in the lead section above the infobox
+        *
+        * In order for a paragraph to be moved the following conditions must 
be met:
+        *   - the lead section contains at least one infobox;
+        *   - the paragraph doesn't already appear before the first infobox
+        *     if any in the DOM;
+        *   - the paragraph contains text content, e.g. no <p></p>;
+        *   - the paragraph doesn't contain coordinates, i.e. span#coordinates.
+        *   - article belongs to the MAIN namespace
+        *
+        * Additionally if paragraph immediate sibling is a list (ol or ul 
element), the list
+        * is also moved along with paragraph above infobox.
+        *
+        * Note that the first paragraph is not moved before hatnotes, or mbox 
or other
+        * elements that are not infoboxes.
+        *
+        * @param DOMElement $leadSectionBody
+        * @param DOMDocument $doc Document to which the section belongs
+        */
+       private function moveFirstParagraphBeforeInfobox( $leadSectionBody, 
$doc ) {
+               $xPath = new DOMXPath( $doc );
+               // Find infoboxes and paragraphs that have text content, i.e. 
paragraphs
+               // that are not empty nor are wrapper paragraphs that contain 
span#coordinates.
+               $infoboxAndParagraphs = $xPath->query(
+                       './table[contains(@class,"infobox")] | 
./p[string-length(text()) > 0]',
+                       $leadSectionBody
+               );
+               // We need both an infobox and a paragraph and the first 
element of our query result
+               // ought to be an infobox.
+               if ( $infoboxAndParagraphs->length >= 2 &&
+                       $infoboxAndParagraphs->item( 0 )->nodeName == 'table'
+               ) {
+                       $firstP = null;
+                       for ( $i = 1; $i < $infoboxAndParagraphs->length; $i++ 
) {
+                               if ( $infoboxAndParagraphs->item( $i 
)->nodeName == 'p' ) {
+                                       $firstP = $infoboxAndParagraphs->item( 
$i );
+                                       break;
+                               }
+                       }
+                       if ( $firstP ) {
+                               $listElementAfterParagraph = null;
+                               $where = $infoboxAndParagraphs->item( 0 );
+
+                               $elementAfterParagraphQuery = $xPath->query( 
'following-sibling::*[1]', $firstP );
+                               if ( $elementAfterParagraphQuery->length > 0 ) {
+                                       $elem = 
$elementAfterParagraphQuery->item( 0 );
+                                       if ( $elem->tagName === 'ol' || 
$elem->tagName === 'ul' ) {
+                                               $listElementAfterParagraph = 
$elem;
+                                       }
+                               }
+
+                               $leadSectionBody->insertBefore( $firstP, $where 
);
+                               if ( $listElementAfterParagraph !== null ) {
+                                       $leadSectionBody->insertBefore( 
$listElementAfterParagraph, $where );
+                               }
+                       }
+               }
+               /**
+                * @see https://phabricator.wikimedia.org/T149884
+                * @todo remove after research is done
+                */
+               if ( MobileContext::singleton()->getMFConfig()->get( 
'MFLogWrappedInfoboxes' ) ) {
+                       $this->logInfoboxesWrappedInContainers( 
$leadSectionBody, $xPath );
+               }
+       }
+
+       /**
+        * Finds all infoboxes which are one or more levels deep in $xPath 
content. When at least one
+        * element is found - log the page title and revision
+        *
+        * @see https://phabricator.wikimedia.org/T149884
+        * @param $leadSectionBody
+        * @param DOMXPath $xPath
+        */
+       private function logInfoboxesWrappedInContainers( $leadSectionBody, 
DOMXPath $xPath ) {
+               $infoboxes = $xPath->query( 
'./*//table[contains(@class,"infobox")]' .
+                       '[not(ancestor::table[contains(@class,"infobox")])]', 
$leadSectionBody );
+               if ( $infoboxes->length > 0 ) {
+                       \MediaWiki\Logger\LoggerFactory::getInstance( 'mobile' 
)->info(
+                               "Found infobox wrapped with container on 
{$this->title} (rev:{$this->revId})"
+                       );
+               }
+       }
+}
diff --git a/includes/transforms/NoTransform.php 
b/includes/transforms/NoTransform.php
new file mode 100644
index 0000000..44b414c
--- /dev/null
+++ b/includes/transforms/NoTransform.php
@@ -0,0 +1,15 @@
+<?php
+
+namespace MobileFrontend\Transforms;
+
+use DOMElement;
+
+class NoTransform implements IMobileTransform {
+       /**
+        * Does nothing.
+        * @param DOMElement $node to be transformed
+        */
+       public function apply( DOMElement $node ) {
+               // nothing happens
+       }
+}
diff --git a/tests/phpunit/MobileFormatterTest.php 
b/tests/phpunit/MobileFormatterTest.php
index 702d0c6..0ba00eb 100644
--- a/tests/phpunit/MobileFormatterTest.php
+++ b/tests/phpunit/MobileFormatterTest.php
@@ -978,30 +978,6 @@
         * @see https://phabricator.wikimedia.org/T163805
         * @covers MobileFormatter::filterContent
         */
-       public function testLoggingOfInfoboxesLogsOnlyMainNamespace() {
-               $this->setMwGlobals( [
-                       'wgMFLogWrappedInfoboxes' => true
-               ] );
-
-               $input = '<div>'. $this->buildInfoboxHTML( 'test' ).'</div>';
-               $title = 'Special:T163805';
-
-               $formatter = new MobileFormatter( MobileFormatter::wrapHTML( 
$input ),
-                       Title::newFromText( $title,  NS_SPECIAL ) );
-               $formatter->enableExpandableSections();
-
-               $loggerMock = $this->getMock( \Psr\Log\LoggerInterface::class );
-               $loggerMock->expects( $this->never() )
-                       ->method( 'info' );
-
-               $this->setLogger( 'mobile', $loggerMock );
-               $formatter->filterContent( false, false, false, true );
-       }
-
-       /**
-        * @see https://phabricator.wikimedia.org/T163805
-        * @covers MobileFormatter::filterContent
-        */
        public function testLoggingOfInfoboxesSkipsInfoBoxInsideInfobox() {
                $this->setMwGlobals( [
                        'wgMFLogWrappedInfoboxes' => true
diff --git a/tests/phpunit/transforms/MoveLeadParagraphTransformTest.php 
b/tests/phpunit/transforms/MoveLeadParagraphTransformTest.php
new file mode 100644
index 0000000..dbeb887
--- /dev/null
+++ b/tests/phpunit/transforms/MoveLeadParagraphTransformTest.php
@@ -0,0 +1,39 @@
+<?php
+
+use MobileFrontend\Transforms\MoveLeadParagraphTransform;
+
+/**
+ * @group MobileFrontend
+ */
+class MoveLeadParagraphTransformTest extends MediaWikiTestCase {
+       public static function wrap( $html ) {
+               return "<!DOCTYPE HTML>
+<html><body>$html</body></html>
+";
+       }
+       /**
+        * @dataProvider provideTransform
+        *
+        * @param string $html
+        * @param string $expected
+        */
+       public function testTransform( $html, $expected ) {
+               $transform = new MoveLeadParagraphTransform( 'A', 1 );
+               $doc = new DOMDocument();
+               $doc->loadHTML( self::wrap( $html ) );
+               $transform->apply( $doc->getElementsByTagName( 'body' )->item( 
0 ) );
+               $this->assertEquals( $doc->saveHTML(), self::wrap( $expected ) 
);
+       }
+
+       public function provideTransform() {
+               $infobox = '<table class="infobox"></table>';
+               $paragraph = '<p>first paragraph</p>';
+
+               return [
+                       [
+                               "$infobox$paragraph",
+                               "$paragraph$infobox",
+                       ]
+               ];
+       }
+}

-- 
To view, visit https://gerrit.wikimedia.org/r/401776
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1fb2dc9e7ad1fe0431819ce3ca6295b3df8410e0
Gerrit-PatchSet: 13
Gerrit-Project: mediawiki/extensions/MobileFrontend
Gerrit-Branch: master
Gerrit-Owner: Jdlrobson <jrob...@wikimedia.org>
Gerrit-Reviewer: Jdlrobson <jrob...@wikimedia.org>
Gerrit-Reviewer: Niedzielski <sniedziel...@wikimedia.org>
Gerrit-Reviewer: Pmiazga <pmia...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits

Reply via email to