Hi all,

We are initiating this mail thread to discuss mimicking the Synapse message
context in the Go implementation. We need to maintain the incoming
message payloads, headers, and properties in the message context.

In this effort, we have to consider the following aspects:

   - Lazy parsing: parse only when necessary

   - Support for widely used media types, e.g. XML, JSON, form
     URL-encoded, multipart, plain text, binary, JSON Badgerfish

   - Conversion between data types

   - Accessing content inside the payload, e.g. via XPath or JSONPath

   - Large payload processing

Based on our research regarding the management of message context within
the Synapse mediation engine, particularly in the context of rewriting it
in Go, we have explored how to implement various message-building and
parsing functionalities using Go's standard libraries. Below is a summary
of our findings, along with the challenges we encountered during this
process.

------------------------------

1. How do we achieve lazy parsing in Go?

By analyzing our mediation sequences, we can identify the steps that require
access to the message content (e.g. when a sequence contains a content-aware
mediator). Parsing should be triggered only when such steps are
encountered, avoiding unnecessary processing for steps that do not interact
with the message content.
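The idea above can be sketched with `sync.Once`: the raw bytes are stored as-is, and parsing happens only on the first content-aware access. The type and method names here are illustrative, not the actual Synapse-Go API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// MessageContext holds the raw payload and parses it lazily.
type MessageContext struct {
	raw      []byte
	once     sync.Once
	parsed   map[string]any
	parseErr error
}

// Payload parses the raw bytes exactly once, the first time a
// content-aware mediator asks for the message body.
func (m *MessageContext) Payload() (map[string]any, error) {
	m.once.Do(func() {
		m.parseErr = json.Unmarshal(m.raw, &m.parsed)
	})
	return m.parsed, m.parseErr
}

func main() {
	ctx := &MessageContext{raw: []byte(`{"order":"123"}`)}
	// A content-unaware mediator (e.g. a header-based router) never
	// calls Payload(), so no parsing cost is paid for it.
	p, err := ctx.Payload() // first content-aware access triggers the parse
	fmt.Println(p["order"], err)
}
```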

------------------------------

2. How can we support a wide variety of media types (XML, JSON, form data,
multipart, plain text, binary, JSON Badgerfish) in our Go implementation?

Below are the standard Go libraries for handling each media type.

Media Type                Go library
XML & JSON                encoding/xml & encoding/json
Form URL encoded          net/url & net/http (e.g. req.ParseForm())
Multipart data            mime/multipart
Plain text (text/plain)   io (e.g. io.ReadAll())
Binary data               io & bytes
JSON Badgerfish           no standard Go library, but we found an alternative
                          custom library: msievers/badgerfish
                          <https://github.com/msievers/badgerfish> (MIT licence)
Protocol Buffers          protobuf module
                          <https://pkg.go.dev/google.golang.org/protobuf>

------------------------------

3. What strategy should we use for converting between different data types
during message transformation?

Our suggested approach is to do the data-type conversion only when
necessary (e.g. when applying an XSLT transformation to a JSON request
payload). Once converted, we can discard the previous representation since
it is no longer valid.

------------------------------

4. How can we efficiently access specific parts of the payload (e.g., via
XPath for XML or JSONPath for JSON)?

XPath and JSONPath support is not in the standard Go libraries, but we
found third-party libraries that provide this functionality:

   - For XML: antchfx/xmlquery <https://github.com/antchfx/xmlquery> (MIT
     licence) supports XPath 1.0/2.0

   - For JSON: theory/jsonpath <https://github.com/theory/jsonpath> (MIT
     licence) offers RFC 9535 JSONPath

------------------------------

5. How should we handle large payload processing?

For large payloads, stream processing is essential. Instead of loading
entire messages into memory, process data incrementally using streaming
decoders and pipelines to transform only necessary portions on demand.
Utilizing external storage, such as temporary or memory-mapped files, can
further reduce memory overhead during mediation. This approach ensures
efficient handling of large messages without overwhelming system resources.

We look forward to your thoughts on these approaches and any further
suggestions, especially on handling large payloads.

Best regards,

Thisara Weerakoon
