(freemarker) branch 2.3-gae updated: FREEMARKER-219: The truncate family of built-ins, as in maybeLong?truncate(10, ''), if the terminator string is set to 0 length, now it will not add a space before the terminator string when the cut happened exactly after the end of a word. Also, improved truncate-related documentation.

ddekany Sat, 23 Dec 2023 13:05:27 -0800

This is an automated email from the ASF dual-hosted git repository.

ddekany pushed a commit to branch 2.3-gae
in repository https://gitbox.apache.org/repos/asf/freemarker.git



The following commit(s) were added to refs/heads/2.3-gae by this push:
     new 6334550b FREEMARKER-219: The truncate family of built-ins, as in 
maybeLong?truncate(10, ''), if the terminator string is set to 0 length, now it 
will not add a space before the terminator string when the cut happened exactly 
after the end of a word. Also, improved truncate-related documentation.
6334550b is described below

commit 6334550b3ff136530398459e8f4a9b2d3da57684
Author: ddekany <[email protected]>
AuthorDate: Sat Dec 23 21:30:33 2023 +0100

    FREEMARKER-219: The truncate family of built-ins, as in 
maybeLong?truncate(10, ''), if the terminator string is set to 0 length, now it 
will not add a space before the terminator string when the cut happened exactly 
after the end of a word. Also, improved truncate-related documentation.
---
 .../main/java/freemarker/core/Configurable.java    |  6 +-
 .../core/DefaultTruncateBuiltinAlgorithm.java      | 33 +++++-----
 .../java/freemarker/core/TruncateBuiltInTest.java  | 21 +++++-
 freemarker-manual/src/main/docgen/en_US/book.xml   | 76 ++++++++++++++++++----
 4 files changed, 102 insertions(+), 34 deletions(-)
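
To illustrate the behavior change, here is a minimal, self-contained sketch of
character boundary truncation with a plain string terminator. It is a
simplified model written for this note, not the code in
DefaultTruncateBuiltinAlgorithm: the class TruncateCSketch and its helpers are
hypothetical, dot removal and markup terminators are left out, and the real
terminator length is assumed instead of a separately passed assumed length.
Under those assumptions it reproduces the outputs asserted in
testJiraIssueFREEMARKER219.

```java
// Simplified model of ?truncate_c for a plain string terminator.
// NOT FreeMarker's implementation; all names here are hypothetical.
public class TruncateCSketch {

    static String truncateC(String s, int maxLength, String terminator) {
        if (s.length() <= maxLength) {
            return s; // Short enough, nothing to cut.
        }
        // Mirrors the fix: no word boundary space for a 0 length terminator.
        boolean addSpaceAtWordBoundary = terminator.length() != 0;

        // Index of the last character that fits before the terminator.
        int initialLast = maxLength - terminator.length() - 1;
        int last = skipTrailingWhitespace(s, initialLast);
        if (last < 0) {
            return terminator; // Nothing left to show but the terminator.
        }
        // A space before the terminator would exceed maxLength by 1,
        // so cut one character earlier.
        if (last == initialLast && addSpaceAtWordBoundary && isWordEnd(s, last)) {
            last = skipTrailingWhitespace(s, last - 1);
            if (last < 0) {
                return terminator;
            }
        }
        StringBuilder out = new StringBuilder(s.substring(0, last + 1));
        if (addSpaceAtWordBoundary && isWordEnd(s, last)) {
            out.append(' ');
        }
        return out.append(terminator).toString();
    }

    // Moves i backwards past whitespace; returns -1 if nothing else remains.
    static int skipTrailingWhitespace(String s, int i) {
        while (i >= 0 && Character.isWhitespace(s.charAt(i))) {
            i--;
        }
        return i;
    }

    // True if the character after index i is whitespace or the end of s.
    static boolean isWordEnd(String s, int i) {
        return i + 1 == s.length() || Character.isWhitespace(s.charAt(i + 1));
    }

    public static void main(String[] args) {
        String address = "1234 SOMESTREETSSS AVE NE 123";
        System.out.println(truncateC(address, 25, ""));  // 1234 SOMESTREETSSS AVE NE
        System.out.println(truncateC(address, 25, "|")); // 1234 SOMESTREETSSS AVE N|
    }
}
```

With the empty terminator the cut lands exactly after "NE" and all 25
characters are kept. With "|" the space that would go before the terminator
does not fit, so the sketch cuts one character earlier.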

diff --git a/freemarker-core/src/main/java/freemarker/core/Configurable.java b/freemarker-core/src/main/java/freemarker/core/Configurable.java
index 18820256..fc98db58 100644
--- a/freemarker-core/src/main/java/freemarker/core/Configurable.java
+++ b/freemarker-core/src/main/java/freemarker/core/Configurable.java
@@ -1703,10 +1703,10 @@ public class Configurable {
     }
 
     /**
-     * Specifies the algorithm used for {@code ?truncate}. Defaults to
+     * Specifies the algorithm used for {@code ?truncate}, {@code ?truncate_w}, and {@code ?truncate_c}. Defaults to
      * {@link DefaultTruncateBuiltinAlgorithm#ASCII_INSTANCE}. Most customization needs can be addressed by
-     * creating a new {@link DefaultTruncateBuiltinAlgorithm} with the proper constructor parameters. Otherwise users
-     * my use their own {@link TruncateBuiltinAlgorithm} implementation.
+     * creating a new {@link DefaultTruncateBuiltinAlgorithm} with the proper constructor parameters. Otherwise, users
+     * may use their own {@link TruncateBuiltinAlgorithm} implementation.
      *
      * <p>In case you need to set this with {@link Properties}, or a similar configuration approach that doesn't let you
      * create the value in Java, see examples at {@link #setSetting(String, String)}.
diff --git a/freemarker-core/src/main/java/freemarker/core/DefaultTruncateBuiltinAlgorithm.java b/freemarker-core/src/main/java/freemarker/core/DefaultTruncateBuiltinAlgorithm.java
index 99625b5c..1a0eda79 100644
--- a/freemarker-core/src/main/java/freemarker/core/DefaultTruncateBuiltinAlgorithm.java
+++ b/freemarker-core/src/main/java/freemarker/core/DefaultTruncateBuiltinAlgorithm.java
@@ -143,9 +143,9 @@ public class DefaultTruncateBuiltinAlgorithm extends TruncateBuiltinAlgorithm {
      * @param defaultTerminator
      *            The terminator to use if the invocation (like {@code s?truncate(20)}) doesn't specify it. The
      *            terminator is the text appended after a truncated string, to indicate that it was truncated.
-     *            Typically it's {@code "[...]"} or {@code "..."}, or the same with UNICODE ellipsis character.
+     *            Typically, it's {@code "[...]"} or {@code "..."}, or the same with UNICODE ellipsis character.
      * @param defaultTerminatorLength
-     *            The assumed length of {@code defaultTerminator}, or {@code null} if it should be get via
+     *            The assumed length of {@code defaultTerminator}, or {@code null} if the assumed length is simply
      *            {@code defaultTerminator.length()}.
      * @param defaultTerminatorRemovesDots
      *            Whether dots and ellipsis characters that the {@code defaultTerminator} touches should be removed. If
@@ -157,8 +157,12 @@ public class DefaultTruncateBuiltinAlgorithm extends TruncateBuiltinAlgorithm {
      *            in which case {@code defaultTerminator} will be used even if {@code ?truncate_m} or similar built-in
      *            is called.
      * @param defaultMTerminatorLength
-     *            The assumed length of the terminator, or {@code null} if it should be get via
-     *            {@link #getMTerminatorLength}.
+     *            The assumed length of the terminator, or {@code null} if the assumed length will be
+     *            {@link #getMTerminatorLength(TemplateMarkupOutputModel)}. Note that if you have HTML tags, or entity
+     *            references in the {@code defaultMTerminator}, then the visual length differs from the string length,
+     *            and {@link #getMTerminatorLength(TemplateMarkupOutputModel)} accounts for these complications to an
+     *            extent, but it for example it won't know what CSS does, or if the nested content of some HTML elements
+     *            are not displayed.
      * @param defaultMTerminatorRemovesDots
      *            Similar to {@code defaultTerminatorRemovesDots}, but for {@code defaultMTerminator}. If {@code
      *            null}, and {@code defaultMTerminator} is HTML/XML/XHTML, then it will be examined of the
@@ -168,17 +172,17 @@ public class DefaultTruncateBuiltinAlgorithm extends TruncateBuiltinAlgorithm {
      * @param addSpaceAtWordBoundary,
      *            Whether to add a space before the terminator if the truncation happens directly after the end of a
      *            word. For example, when "too long sentence" is truncated, it will be a like "too long [...]"
-     *            instead of "too long[...]". When the truncation happens inside a word, this has on effect, i.e., it
+     *            instead of "too long[...]". When the truncation happens inside a word, this has no effect, i.e., it
      *            will be always like "too long se[...]" (no space before the terminator). Note that only whitespace is
      *            considered to be a word separator, not punctuation, so if this is {@code true}, you get results
      *            like "Some sentence. [...]".
      * @param wordBoundaryMinLength
      *            Used when {@link #truncate} or {@link #truncateM} has to decide between
-     *            word boundary truncation and character boundary truncation; it's the minimum length, given as
+     *            word boundary truncation, and character boundary truncation; it's the minimum length, given as
      *            proportion of {@code maxLength}, that word boundary truncation has to produce. If the resulting
      *            length is less, we do character boundary truncation instead. For example, if {@code maxLength} is
-     *            30, and this parameter is 0.85, then: 30*0.85 = 25.5, rounded up that's 26, so the resulting length
-     *            must be at least 26. The result of character boundary truncation will be always accepted, even if its
+     *            30, and this parameter is 0.85, then: 30 * 0.85 = 25.5, rounded up that's 26, so the resulting length
+     *            must be at least 26. The result of character boundary truncation will always be accepted, even if it's
      *            still too short. If this parameter is {@code null}, then {@link #DEFAULT_WORD_BOUNDARY_MIN_LENGTH}
      *            will be used. If this parameter is 0, then truncation always happens at word boundary. If this
      *            parameter is 1.0, then truncation doesn't prefer word boundaries over other places.
@@ -355,7 +359,7 @@ public class DefaultTruncateBuiltinAlgorithm extends TruncateBuiltinAlgorithm {
      *
      * <p>In the implementation in {@link DefaultTruncateBuiltinAlgorithm}, if the markup is HTML/XML/XHTML, then this
      * counts the characters outside tags and comments, and inside CDATA sections (ignoring the CDATA section
-     * delimiters). Furthermore then it counts character and entity references as having length of 1. If the markup
+     * delimiters). Furthermore, then it counts character and entity references as having length of 1. If the markup
      * is not HTML/XML/XHTML (or subclasses of those {@link MarkupOutputFormat}-s) then it doesn't know how to
      * measure it, and simply returns 3.
      */
@@ -394,9 +398,6 @@ public class DefaultTruncateBuiltinAlgorithm extends TruncateBuiltinAlgorithm {
                 : true;
     }
 
-    /**
-     * Deals with both CB and WB truncation, hence it's unified.
-     */
     private TemplateModel unifiedTruncate(
             String s, int maxLength,
             TemplateModel terminator, Integer terminatorLength,
@@ -437,7 +438,7 @@ public class DefaultTruncateBuiltinAlgorithm extends TruncateBuiltinAlgorithm {
                 terminator, terminatorLength, terminatorRemovesDots,
                 mode);
 
-        // The terminator is always shown, even if with that we exceed maxLength. Otherwise the user couldn't
+        // The terminator is always shown, even if with that we exceed maxLength. Otherwise, the user couldn't
         // see that the string was truncated.
         if (truncatedS == null || truncatedS.length() == 0) {
             return terminator;
@@ -447,7 +448,7 @@ public class DefaultTruncateBuiltinAlgorithm extends TruncateBuiltinAlgorithm {
             truncatedS.append(((TemplateScalarModel) terminator).getAsString());
             return new SimpleScalar(truncatedS.toString());
         } else if (terminator instanceof TemplateMarkupOutputModel) {
-            TemplateMarkupOutputModel markup = (TemplateMarkupOutputModel) terminator;
+            TemplateMarkupOutputModel markup = (TemplateMarkupOutputModel<?>) terminator;
             MarkupOutputFormat outputFormat = markup.getOutputFormat();
             return outputFormat.concat(outputFormat.fromPlainTextByEscaping(truncatedS.toString()), markup);
         } else {
@@ -470,6 +471,8 @@ public class DefaultTruncateBuiltinAlgorithm extends TruncateBuiltinAlgorithm {
             return null;
         }
 
+        boolean addSpaceAtWordBoundary = this.addSpaceAtWordBoundary && terminatorLength != 0;
+
         if (mode == TruncationMode.AUTO && wordBoundaryMinLength < 1.0 || mode == TruncationMode.WORD_BOUNDARY) {
             // Do word boundary truncation. Might not be possible due to minLength restriction (see below), in which
             // case truncedS stays null.
@@ -527,7 +530,7 @@ public class DefaultTruncateBuiltinAlgorithm extends TruncateBuiltinAlgorithm {
 
         // If the truncation point is a word boundary, and thus we add a space before the terminator, then we may run
         // out of the maxLength by 1. In that case we have to truncate one character earlier.
-        if (cbLastCIdx == cbInitialLastCIdx && addSpaceAtWordBoundary  && isWordEnd(s, cbLastCIdx)) {
+        if (cbLastCIdx == cbInitialLastCIdx && addSpaceAtWordBoundary && isWordEnd(s, cbLastCIdx)) {
             cbLastCIdx--;
             if (cbLastCIdx < 0) {
                 return null;
diff --git a/freemarker-core/src/test/java/freemarker/core/TruncateBuiltInTest.java b/freemarker-core/src/test/java/freemarker/core/TruncateBuiltInTest.java
index 2cfe037f..9929151c 100644
--- a/freemarker-core/src/test/java/freemarker/core/TruncateBuiltInTest.java
+++ b/freemarker-core/src/test/java/freemarker/core/TruncateBuiltInTest.java
@@ -71,7 +71,8 @@ public class TruncateBuiltInTest extends TemplateTest {
 
     @Test
     public void testTruncateM() throws IOException, TemplateException {
-        assertOutput("${t?truncateM(15)}", "Some text <span class='truncateTerminator'>[&#8230;]</span>"); // String arg allowed...
+        assertOutput("${t?truncateM(15)}",
+                "Some text <span class='truncateTerminator'>[&#8230;]</span>"); // String arg allowed...
         assertOutput("${t?truncate_m(15, mTerm)}", "Some text for " + M_TERM_SRC);
         assertOutput("${t?truncateM(15, mTerm)}", "Some text for " + M_TERM_SRC);
         assertOutput("${t?truncateM(15, mTerm, 3)}", "Some text " + M_TERM_SRC);
@@ -150,4 +151,20 @@ public class TruncateBuiltInTest extends TemplateTest {
         assertOutput("${t?truncateM(20)}", "Some text for " + M_TERM_SRC);
     }
 
-}
+    @Test
+    public void testJiraIssueFREEMARKER219() throws IOException, TemplateException {
+        assertOutput("${'1 3'?truncate_c(2, '|')}", "|");
+        assertOutput("${' 2 '?truncate_c(2, '|')}", "|");
+        assertOutput("${'1 '?truncate_c(1, '|')}", "|");
+        assertOutput("${' 2'?truncate_c(1, '|')}", "|");
+        assertOutput("${'1234 SOMESTREETSSS AVE NE 123'?truncate_c(25, '|')}", "1234 SOMESTREETSSS AVE N|");
+
+        assertOutput("${'1 3'?truncate_c(2, '')}", "1");
+        assertOutput("${' 2 '?truncate_c(2, '')}", " 2");
+        assertOutput("${'1 '?truncate_c(1, '')}", "1");
+        assertOutput("${' 2'?truncate_c(1, '')}", "");
+        assertOutput("${'1234 SOMESTREETSSS AVE NE 123'?truncate_c(25, '')}", "1234 SOMESTREETSSS AVE NE");
+        assertOutput("${'1234 SOMESTREETSSS AVE NE 123'?truncate_c(24, '')}", "1234 SOMESTREETSSS AVE N");
+        assertOutput("${'1234 SOMESTREETSSS AVE NE 123'?truncate_c(23, '')}", "1234 SOMESTREETSSS AVE");
+    }
+}
\ No newline at end of file
diff --git a/freemarker-manual/src/main/docgen/en_US/book.xml b/freemarker-manual/src/main/docgen/en_US/book.xml
index fa066671..fce81f36 100644
--- a/freemarker-manual/src/main/docgen/en_US/book.xml
+++ b/freemarker-manual/src/main/docgen/en_US/book.xml
@@ -15115,11 +15115,27 @@ foobar</programlisting>
             <primary>truncate_w_m built-in</primary>
           </indexterm>
 
+          <note>
+            <para>If you just want to limit the length of string with
+            straightforward behavior, then do not use this built in, but the
+            <link linkend="dgui_template_exp_seqenceop_slice">sequence
+            slicing</link>, and <link
+            linkend="dgui_template_exp_direct_ranges">..* length limited
+            range</link> operators. For example, <literal>s[0 ..*
+            10]</literal> will give the first 10 characters of
+            <literal>s</literal>, if <literal>s</literal> is longer than that,
+            otherwise it just gives <literal>s</literal> as is. While
+            <literal>s?truncate(10, '')</literal> expresses similar intent, it
+            has complicated rules to give a result that looks nicer for
+            humans, like it trims the right side at the cut, and sometimes
+            cuts a bit early to avoid cutting into the last word.</para>
+          </note>
+
           <para>Cuts off the end of a string if that's necessary to keep it
-          under a the length given as parameter, and appends a terminator
-          string (<literal>[...]</literal> by default) to indicate that the
-          string was truncated. Example (assuming default FreeMarker
-          configuration settings):</para>
+          under the length given as parameter, and appends a terminator string
+          (<literal>[...]</literal> by default) to indicate that the string
+          was truncated. Example (assuming default FreeMarker configuration
+          settings):</para>
 
           <programlisting role="template">&lt;#assign shortName='This is short'&gt;
 &lt;#assign longName='This is a too long name'&gt;
@@ -15143,7 +15159,7 @@ This is a [...]
 Truncated at "character boundary":
 This isonev[...]</programlisting>
 
-          <para>Things to note above:</para>
+          <para>Notes on some tricky aspects for truncation:</para>
 
           <itemizedlist>
             <listitem>
@@ -15159,9 +15175,9 @@ This isonev[...]</programlisting>
               better look (see later). Actually, the result length can also be
               longer than the parameter length, when the desired length is
               shorter than the terminator string alone, in which case the
-              terminator is still returned as is. Also, an algorithms other
+              terminator is still returned as is. Also, an algorithm other
               than the default might choses to return a longer string, as the
-              length parameter is in principle just hint for the desired
+              length parameter is in principle just a hint for the desired
               visual length.</para>
             </listitem>
 
@@ -15180,7 +15196,21 @@ This isonev[...]</programlisting>
               between the word end and the terminator string, otherwise
               there's no space between them. Only whitespace is treated as
               word separator, not punctuation, so this generally gives
-              intuitive results.</para>
+              intuitive results. (Except, if the terminator string is set to
+              be 0 length, no space is added before it, starting from
+              FreeMarker 2.3.33.)</para>
+            </listitem>
+
+            <listitem>
+              <para>Before adding the terminator string (possibly with a word
+              boundary space before it, as explained above) after the string
+              whose length was already cut, trailing whitespace is removed
+              from that. For example <literal>'1
+              67890A'?truncate(10)</literal>, where there are 4 spaces between
+              the <literal>1</literal> and <literal>6</literal>, will give
+              <quote><literal>1 [...]</literal></quote> (7 characters), not
+              <quote><literal>1 [...]</literal></quote> (10
+              characters).</para>
             </listitem>
           </itemizedlist>
 
@@ -15220,7 +15250,11 @@ This isonev[...]</programlisting>
                     to give a string length closer to the length specified,
                     but still not an exact length, as it removes white-space
                     before the terminator string, and re-adds a space if we
-                    are just after the end of a word, etc.</para>
+                    are just after the end of a word, etc. (Except, space is
+                    not re-added if the terminator string is set to be 0
+                    length, starting from FreeMarker 2.3.33.) If you need
+                    exact length, simply use <literal>longName[0 ..*
+                    16]</literal>.</para>
                   </listitem>
                 </itemizedlist>
               </listitem>
@@ -15229,11 +15263,11 @@ This isonev[...]</programlisting>
                 <para>Specifying the terminator string (instead of relying on
                 its default): <literal>truncate</literal> and all
                 <literal>truncate_<replaceable>...</replaceable></literal>
-                built-ins have an additional optional parameter for it. After
-                that, a further optional parameter can specify the assumed
-                length of the terminator string (otherwise its real length
-                will be used). If you find yourself specifying the terminator
-                string often, then certainly the defaults should be configured
+                built-ins have an optional 2nd parameter for that. After that,
+                a further optional parameter can specify the assumed length of
+                the terminator string (otherwise its real length will be
+                used). If you find yourself specifying the terminator string
+                often, then certainly the defaults should be configured
                 instead (via <literal>truncate_builtin_algorithm
                 configuration</literal> - see earlier). Example:</para>
 
@@ -30141,6 +30175,20 @@ TemplateModel x = env.getVariable("x");  // get variable x</programlisting>
               </itemizedlist>
             </listitem>
 
+            <listitem>
+              <para><link
+              xlink:href="https://issues.apache.org/jira/browse/FREEMARKER-219">FREEMARKER-219</link>:
+              The <link linkend="ref_builtin_truncate"><quote>truncate</quote>
+              family of built-ins</link>, as in
+              <literal>maybeLong?truncate(10, '')</literal>, if the terminator
+              string is set to 0 length, now it will not add a space before
+              the terminator string when the cut happened exactly after the
+              end of a word. (Note that if you are using something like
+              <literal>maybeLong?truncate_c(10, '')</literal>, then certainly
+              what you really want is <literal>maybeLong[0 ..* 10]</literal>,
+              as that doesn't do trimming at the cut.)</para>
+            </listitem>
+
             <listitem>
               <para><link
               xlink:href="https://github.com/apache/freemarker/pull/89">GitHub
