GWicke has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/81711


Change subject: Don't strip trailing newlines and space on tokens from sub-pipelines
......................................................................

Don't strip trailing newlines and space on tokens from sub-pipelines

The stripEOFTkfromTokens method used to strip trailing newlines and space-only
string tokens from sub-pipelines in addition to the expected EOFTk. This might
have been a work-around for missing paragraph suppression and possibly missing
stripping in other places, but does not seem to be needed any more.

The extra stripping also made the output non-deterministic whenever a
sub-pipeline emitted a trailing space-only token. The grouping of tokens
returned from the sub-pipeline depended on the order of template expansion, so
the space was stripped in some execution orders but not in others.
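
A minimal sketch of how the grouping can change the result. It assumes the old
stripping is applied to each returned group of tokens independently, which is a
simplification. EOFTk and NlTk are hypothetical stand-ins for the pd token
classes, and oldStrip and stripChunks are illustrative names, not Parsoid
functions.

```javascript
// Hypothetical stand-ins for pd.EOFTk and pd.NlTk; only the constructors matter.
function EOFTk() {}
function NlTk() {}

// Simplified model of the OLD behaviour: drop trailing EOFTk, NlTk and
// whitespace-only string tokens from the end of one group of tokens.
function oldStrip(chunk) {
    var out = chunk.slice();
    while (out.length) {
        var l = out[out.length - 1];
        var strippable = l.constructor === EOFTk ||
            l.constructor === NlTk ||
            (l.constructor === String && /^\s+$/.test(l));
        if (!strippable) { break; }
        out.pop();
    }
    return out;
}

// Assumption: each group is stripped on its own, then the results are joined.
function stripChunks(chunks) {
    return chunks.reduce(function (acc, chunk) {
        return acc.concat(oldStrip(chunk));
    }, []);
}

// The same token stream, grouped in two different ways.
var oneChunk = stripChunks([['a', ' ', 'b', new EOFTk()]]);
var twoChunks = stripChunks([['a', ' '], ['b', new EOFTk()]]);

console.log(JSON.stringify(oneChunk));  // ["a"," ","b"]
console.log(JSON.stringify(twoChunks)); // ["a","b"]
```

In the second grouping the space happens to sit at the end of a group and is
lost; in the first it sits in the middle and survives.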

I have created a (currently unused) replacement method
stripTrailingNewlinesFromTokens that does the newline stripping in case that
still turns out to be necessary. We should remove it when we are satisfied
that the stripping is really not needed.

Change-Id: I83de5e87387d4d629eece946633e91b8e0f30854
---
M js/lib/mediawiki.TokenTransformManager.js
M js/lib/mediawiki.Util.js
2 files changed, 30 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/extensions/Parsoid refs/changes/11/81711/1

diff --git a/js/lib/mediawiki.TokenTransformManager.js b/js/lib/mediawiki.TokenTransformManager.js
index fec8350..0eba1f7 100644
--- a/js/lib/mediawiki.TokenTransformManager.js
+++ b/js/lib/mediawiki.TokenTransformManager.js
@@ -327,7 +327,7 @@
  * @param {Object} ret The chunk we're returning from the transform.
  */
 AsyncTokenTransformManager.prototype.emitChunk = function( ret ) {
-       this.env.dp( 'emitChunk', ret );
+       this.env.dp( 'AsyncTokenTransformManager.emitChunk', ret );
 
        function checkForEOFTkErrors(ttm, ret, atEnd) {
                if ( ttm.frame.depth === 0 &&
diff --git a/js/lib/mediawiki.Util.js b/js/lib/mediawiki.Util.js
index 845bd4a..fa3fa89 100644
--- a/js/lib/mediawiki.Util.js
+++ b/js/lib/mediawiki.Util.js
@@ -645,7 +645,7 @@
                return tokens;
        },
 
-       // Strip 'end' tokens and trailing newlines
+       // Strip EOFTk tokens
        stripEOFTkfromTokens: function ( tokens ) {
                // this.dp( 'stripping end or whitespace tokens' );
                if ( tokens.constructor !== Array ) {
@@ -654,28 +654,41 @@
                if ( ! tokens.length ) {
                        return tokens;
                }
-               // Strip 'end' tokens and trailing newlines
-               var l = tokens[tokens.length - 1];
-               if ( l &&
-                    ( l.constructor === pd.EOFTk ||
-                      l.constructor === pd.NlTk ||
-                               ( l.constructor === String && l.match( /^\s+$/ ) ) ) ) {
-                       var origTokens = tokens;
-                       tokens = origTokens.slice();
-                       tokens.rank = origTokens.rank;
-                       while ( tokens.length &&
-                               (( l.constructor === pd.EOFTk  ||
-                                  l.constructor === pd.NlTk )  ||
-                                ( l.constructor === String && l.match( /^\s+$/ ) ) ) )
-                       {
+               // Strip 'end' tokens
+               if ( tokens.length && tokens.last().constructor === pd.EOFTk ) {
+                       // clone tokens
+                       var rank = tokens.rank;
+                       tokens = tokens.slice();
+                       tokens.rank = rank;
+                       while ( tokens.length && tokens.last().constructor === pd.EOFTk ) {
                                // this.dp( 'stripping end or whitespace tokens' );
                                tokens.pop();
-                               l = tokens[tokens.length - 1];
                        }
                }
                return tokens;
        },
 
+       // Strip NlTk and ws-only trailing text tokens. Used to be part of
+       // stripEOFTkfromTokens, but unclear if this is still needed.
+       // TODO: remove this if this is not needed any more!
+       stripTrailingNewlinesFromTokens: function (tokens) {
+               var lastMatches = function(toks) {
+                       var lastTok = toks.last();
+                       return lastTok && (
+                               lastTok.constructor === pd.NlTk ||
+                               lastTok.constructor === String && /^\s+$/.test(lastTok));
+               };
+               if (lastMatches(tokens)) {
+                       // clone before popping so the caller's array is not mutated
+                       tokens = tokens.slice();
+               }
+               while (lastMatches(tokens))
+               {
+                       tokens.pop();
+               }
+               return tokens;
+       },
+
        /**
         * Perform a shallow clone of a chunk of tokens
         */
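
The clone-before-pop pattern in the new stripEOFTkfromTokens can be tried in
isolation with the sketch below. EOFTk is a hypothetical stand-in for pd.EOFTk,
last() replaces the Array.prototype.last() helper the patch relies on, and
returning non-array input unchanged is an assumption, since that branch is not
shown in the hunk.

```javascript
// Hypothetical stand-in for pd.EOFTk.
function EOFTk() {}

// Local helper standing in for the tokens.last() call used in the patch.
function last(arr) { return arr[arr.length - 1]; }

function stripEOFTk(tokens) {
    // Assumption: non-array or empty input is handed back untouched.
    if (!Array.isArray(tokens) || !tokens.length) { return tokens; }
    if (last(tokens).constructor === EOFTk) {
        // Clone so the caller's array is not mutated, and carry over the
        // rank property, which slice() does not copy.
        var rank = tokens.rank;
        tokens = tokens.slice();
        tokens.rank = rank;
        while (tokens.length && last(tokens).constructor === EOFTk) {
            tokens.pop();
        }
    }
    return tokens;
}

var input = ['foo', ' ', new EOFTk()];
input.rank = 2;
var output = stripEOFTk(input);

// Only the EOFTk is removed; the trailing space token stays in place.
console.log(output.length, output.rank, input.length); // 2 2 3
```

The trailing space-only token is left alone, which is the behaviour change this
patch is after.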

-- 
To view, visit https://gerrit.wikimedia.org/r/81711
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I83de5e87387d4d629eece946633e91b8e0f30854
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/extensions/Parsoid
Gerrit-Branch: master
Gerrit-Owner: GWicke <gwi...@wikimedia.org>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits