[ 
https://issues.apache.org/jira/browse/JAMES-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862480#comment-17862480
 ] 

aleksey edited comment on JAMES-4049 at 7/2/24 3:24 PM:
--------------------------------------------------------

Dear [~btellier], please advise me on the best way to proceed. When working 
with Preview, I found that `&nbsp;` entities are not processed. For example, 
if you take the attached
[^Message17199066550193266373.eml]
message and process it using `new Factory(new MessageContentExtractor(), new 
JsoupHtmlTextExtractor())`, you get a string similar to 'Complete your 
daily lesson, improve your English. NBSP NBSP NBSP NBSP NBSP ... NBSP' in 
response. This forces me to remove the unwanted characters manually.
{code:java}
import lombok.NonNull;
import lombok.extern.slf4j.Slf4j;
import org.apache.james.jmap.draft.utils.JsoupHtmlTextExtractor;
import org.apache.james.mime4j.dom.Message;
import org.apache.james.util.html.HtmlTextExtractor;
import org.apache.james.util.mime.MessageContentExtractor;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.Optional;
import java.util.function.Predicate;

import static org.apache.james.jmap.api.model.Preview.Factory;

@Slf4j
@Service
public class PreviewServiceImpl implements PreviewService {

    private final Factory factory;

    public PreviewServiceImpl() {
        MessageContentExtractor messageContentExtractor = new MessageContentExtractor();
        HtmlTextExtractor htmlTextExtractor = new JsoupHtmlTextExtractor();
        this.factory = new Factory(messageContentExtractor, htmlTextExtractor);
    }

    @Override
    public Optional<Preview> getPreview(@NonNull Message message) {
        return Optional.of(message)
                .map(this::extractCandidateToPreview)
                .map(this::removeZeroWidthNonJoiner)
                .map(String::strip)
                .filter(Predicate.not(String::isEmpty))
                .map(Preview::new);
    }

    private String extractCandidateToPreview(Message message) {
        try {
            return factory.fromMime4JMessage(message).getValue();
        } catch (IOException e) {
            log.warn("Failed to extract preview from message", e);
            return "";
        }
    }

    /**
     * Removes ZERO WIDTH NON-JOINER characters
     *
     * @param text Text to check
     * @return Checked text
     */
    private String removeZeroWidthNonJoiner(String text) {
        return text.replaceAll("\\u200C", "");
    }
}
{code}
 

Do you think this is an issue with the Preview class or the 
JsoupHtmlTextExtractor class? Based on your response, I will open another issue 
for improvement.
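
For illustration only, here is a minimal sketch of the kind of manual cleanup described above. The class name `PreviewTextCleaner` and its `clean` method are hypothetical and are not part of Apache James. The sketch assumes the "NBSP" characters in the extracted text are U+00A0 NO-BREAK SPACE (what `&nbsp;` decodes to), possibly mixed with zero-width characters such as U+200C. Note that `String::strip` on its own does not help here, because Java's `Character.isWhitespace` deliberately excludes the non-breaking spaces U+00A0, U+2007 and U+202F.

```java
/**
 * Hypothetical helper, not part of Apache James: normalizes text extracted
 * from HTML before it is used as a preview.
 */
final class PreviewTextCleaner {

    private PreviewTextCleaner() {
        // static utility, no instances
    }

    /**
     * Replaces non-breaking spaces with regular spaces, removes zero-width
     * characters, collapses whitespace runs and strips both ends.
     */
    static String clean(String text) {
        return text
                // U+00A0 NO-BREAK SPACE, U+2007 FIGURE SPACE, U+202F NARROW NO-BREAK SPACE
                .replaceAll("[\\u00A0\\u2007\\u202F]", " ")
                // U+200B ZERO WIDTH SPACE, U+200C ZWNJ, U+200D ZWJ, U+FEFF BOM
                .replaceAll("[\\u200B\\u200C\\u200D\\uFEFF]", "")
                // collapse what is left into single spaces
                .replaceAll("\\s+", " ")
                .strip();
    }
}
```

In the service above, such a helper could stand in for `removeZeroWidthNonJoiner`, for example `.map(PreviewTextCleaner::clean)` in place of the separate ZWNJ removal and `strip` steps.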

 


 

> Configurable length of the preview value
> ----------------------------------------
>
>                 Key: JAMES-4049
>                 URL: https://issues.apache.org/jira/browse/JAMES-4049
>             Project: James Server
>          Issue Type: Improvement
>          Components: data
>    Affects Versions: 3.6.0
>            Reporter: aleksey
>            Priority: Minor
>         Attachments: Message17199066550193266373.eml
>
>
> Hello, dear friends! I am using your james-server-data-jmap project (v3.8.1) 
> in my work. It would be great if you allowed customization of 
> {code:java}
> org.apache.james.jmap.api.model.Preview  {code}
> that is, if you replaced the constant 
> {code:java}
> private static final int MAX_LENGTH = 256; {code}
> with a configurable field.
> Thank you for your work!!
>  
>  
>  
>  
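
As a rough sketch of what the request above could look like, the length limit can be passed in through a constructor instead of being a compile-time constant. The class `PreviewTruncator` below is hypothetical and only illustrates the idea; it is not the James `Preview` class and does not reflect how James implements or would implement this.

```java
import java.util.Objects;

/**
 * Hypothetical illustration, not part of Apache James: truncates preview
 * text to a limit supplied at construction time rather than a constant.
 */
final class PreviewTruncator {

    private final int maxLength;

    PreviewTruncator(int maxLength) {
        if (maxLength <= 0) {
            throw new IllegalArgumentException("maxLength must be positive");
        }
        this.maxLength = maxLength;
    }

    /** Returns the text unchanged if it fits, otherwise its leading part. */
    String truncate(String text) {
        Objects.requireNonNull(text, "text");
        if (text.length() <= maxLength) {
            return text;
        }
        int end = maxLength;
        // Do not cut between the two halves of a surrogate pair.
        if (Character.isHighSurrogate(text.charAt(end - 1))) {
            end--;
        }
        return text.substring(0, end);
    }
}
```

A caller would then pick the limit from its own configuration, for example `new PreviewTruncator(256)` to keep the current value or a larger number where longer previews are wanted.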


