Check pull request https://github.com/apache/apex-malhar/pull/396. It tries to use Java regex with group syntax to populate the fields of a POJO. Maybe you can use some of the ideas from this pull request.
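
To make the group idea concrete, here is a minimal sketch of how named regex groups could fill a POJO for a common log record. This is not the code from the pull request: the class names (CommonLogEntry, CommonLogParser), the field names, and the exact pattern are assumptions made up for illustration, and a real operator would take the pattern from a property rather than hard-code it.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical POJO; the field names are illustrative only.
class CommonLogEntry {
    String host;
    String user;
    String timestamp;
    String request;
    int status;
    long bytes; // -1 when the record has "-" for the response size
}

// Hypothetical parser; not part of Malhar.
class CommonLogParser {
    // Assumed pattern for the common log format, one named group per field.
    private static final Pattern COMMON_LOG = Pattern.compile(
        "^(?<host>\\S+) (?<ident>\\S+) (?<user>\\S+) \\[(?<timestamp>[^\\]]+)\\] "
        + "\"(?<request>[^\"]*)\" (?<status>\\d{3}) (?<bytes>\\d+|-)$");

    // Returns the populated POJO, or empty when the line does not match
    // (an operator could route such lines to its error port).
    static Optional<CommonLogEntry> parse(String line) {
        Matcher m = COMMON_LOG.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        CommonLogEntry entry = new CommonLogEntry();
        entry.host = m.group("host");
        entry.user = m.group("user");
        entry.timestamp = m.group("timestamp");
        entry.request = m.group("request");
        entry.status = Integer.parseInt(m.group("status"));
        String size = m.group("bytes");
        entry.bytes = "-".equals(size) ? -1L : Long.parseLong(size);
        return Optional.of(entry);
    }
}

class LogRegexSketch {
    public static void main(String[] args) {
        String line = "127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "
            + "\"GET /apache_pb.gif HTTP/1.0\" 200 2326";
        CommonLogParser.parse(line).ifPresent(
            e -> System.out.println(e.user + " " + e.status + " " + e.bytes));
    }
}
```

Returning an empty Optional for a non-matching line keeps the valid and invalid paths separate, which lines up with having one port for parsed POJOs and one for error records.
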

On Fri, Nov 18, 2016 at 12:30 AM, Pradeep A. Dalvi <p...@apache.org> wrote:
> +1 for the feature
>
> On Thu, 17 Nov 2016 at 16:56 Shraddha Jog <shraddha....@synerzip.com> wrote:
>
>> Dear community,
>>
>> We would like to add operator in malhar for parsing different types of
>> logs.
>> Idea of this operator is to read log data records of known formats such as
>> Syslog, common log, combined log, extended log etc from the upstream in a
>> DAG, parse/validate it based on the configured format and emit the
>> validated POJO to the downstream.
>>
>> We are not targeting log formats from particular library as such but the
>> default formats for common log, combined log, extended log and sys log.
>> Also if user has some specific log format then that could be provided in a
>> property and operator will parse the log according to the given format.
>>
>> Properties:
>> LogFileFormat : Property to define data format for the log data record
>> being read at the Input Port. It can be either from the above four default
>> log formats or a json specifying fields and regular expression. More
>> details can be found in the document.
>>
>> Ports :
>> 1. ParsedLog: This port shall emit the parsed/validated POJO object created
>> based on the log format configured by the user.
>>
>> 2. ErrorPort: This port shall emit the error log data record.
>>
>> Proposed design can be found here
>> <https://docs.google.com/document/d/1RoTOUx_0chwTSahGxiIXlgACgRNfXiVv17hMezxFz74/edit?usp=sharing>.
>>
>> Thanks,
>> Shraddha
>>
>> --
>> This e-mail, including any attached files, may contain confidential and
>> privileged information for the sole use of the intended recipient. Any
>> review, use, distribution, or disclosure by others is strictly prohibited.
>> If you are not the intended recipient (or authorized to receive information
>> for the intended recipient), please contact the sender by reply e-mail and
>> delete all copies of this message.