On Sun, 17 Oct 2021 at 16:43, <[email protected]> wrote:
> This is an automated email from the ASF dual-hosted git repository.
>
> humbedooh pushed a commit to branch master
> in repository
> https://gitbox.apache.org/repos/asf/incubator-ponymail-foal.git
>
> commit 2dff9351d119ddee5c5e0171991c54b1911f05b1
> Author: Daniel Gruno <[email protected]>
> AuthorDate: Sun Oct 17 17:41:51 2021 +0200
>
> Add a separate header for short bodies for stats.py
> ---
> tools/archiver.py | 5 ++++-
> tools/mappings.yaml | 2 ++
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/tools/archiver.py b/tools/archiver.py
> index 1783d0e..dc12e39 100755
> --- a/tools/archiver.py
> +++ b/tools/archiver.py
> @@ -584,6 +584,8 @@ class Archiver(object): # N.B. Also used by import-mbox.py
> ghash = hashlib.md5(mailaddr.encode("utf-8")).hexdigest()
>
> notes.append(["ARCHIVE: Email archived as %s at %u" % (document_id, time.time())])
> + body_unflowed = body.unflow() if body else ""
> + body_shortened = body_unflowed[:210] # 210 so that we can tell if > 200.
>
>
-1
What is special about 200 and 210?
These numbers should be named constants (with suitable documentation) or
possibly configuration items.
The only bare numbers I would expect to see in code are 0 and 1 (or -1).
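
For illustration, something along these lines would do. This is only a
sketch: the constant names and the helper function are hypothetical and
do not appear in the commit; only the values 200 and 210 come from it.

```python
# Hypothetical names, chosen for illustration only.

# Bodies longer than this many characters are not considered "short".
SHORT_BODY_LIMIT = 200

# Extra characters kept beyond the limit, so that a reader of the stored
# value can tell whether the original body exceeded SHORT_BODY_LIMIT.
SHORT_BODY_MARGIN = 10

# Number of characters actually stored in the short-body field (210).
SHORT_BODY_STORED = SHORT_BODY_LIMIT + SHORT_BODY_MARGIN


def shorten_body(body_unflowed: str) -> str:
    """Return the leading part of the body kept for the short-body field."""
    return body_unflowed[:SHORT_BODY_STORED]
```

The archiver line would then read
body_shortened = shorten_body(body_unflowed), and the relationship between
the two numbers is documented in one place.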
> output_json = {
> "from_raw": msg_metadata["from"],
> @@ -603,7 +605,8 @@ class Archiver(object): # N.B. Also used by import-mbox.py
> "private": private,
> "references": msg_metadata["references"],
> "in-reply-to": irt,
> - "body": body.unflow() if body else "",
> + "body": body_unflowed,
> + "body_short": body_shortened,
> "html_source_only": body and body.html_as_source or False,
> "attachments": attachments,
> "forum": (lid or "").strip("<>").replace(".", "@", 1),
> diff --git a/tools/mappings.yaml b/tools/mappings.yaml
> index 4bb4978..6ad72d3 100644
> --- a/tools/mappings.yaml
> +++ b/tools/mappings.yaml
> @@ -55,6 +55,8 @@ mbox:
> type: long
> body:
> type: text
> + body_short:
> + type: text
> cc:
> type: text
> date:
>