[
https://issues.apache.org/jira/browse/APEXMALHAR-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401624#comment-15401624
]
ASF GitHub Bot commented on APEXMALHAR-2153:
--------------------------------------------
Github user chinmaykolhatkar commented on a diff in the pull request:
https://github.com/apache/apex-malhar/pull/353#discussion_r72931886
--- Diff: docs/operators/enricher.md ---
@@ -0,0 +1,169 @@
+POJO Enricher
+=============
+
+## Operator Objective
+This operator receives a POJO ([Plain Old Java
Object](https://en.wikipedia.org/wiki/Plain_Old_Java_Object)) as an incoming
tuple and uses an external source to enrich the data in
+the incoming tuple and finally emits the enriched data as a new enriched
POJO.
+
+POJOEnricher supports enrichment from the following external sources:
+
+1. **JSON File Based** - Reads a file containing JSON content into memory
and uses it to enrich the data. This can be done using the FSLoader
implementation.
+2. **JDBC Based** - Any JDBC store can act as an external entity which the
enricher queries for data to enrich incoming tuples. This can be done using the
JDBCLoader implementation.
+
+POJO Enricher does not hold any state and is **idempotent**,
**fault-tolerant** and **statically/dynamically partitionable**.
+
+## Operator Use Case
+1. Bank ***transaction records*** usually contain a customerId. For further
analysis of a transaction, one wants the customer name and other customer-related
information.
+Such information is present in another database. One could enrich the
transaction record with customer information using POJOEnricher.
+2. ***Call Data Records (CDR)*** contain only the mobile/telephone numbers of
the customer. Customer information is missing from the CDR. POJO Enricher can be
used to enrich the
+CDR with customer data for further analysis.
+
+## Operator Information
+1. Operator location: ***malhar-contrib***
+2. Available since: ***3.4.0***
+3. Operator state: ***Evolving***
+4. Java Packages:
+ * Operator:
***[com.datatorrent.contrib.enrich.POJOEnricher](https://www.datatorrent.com/docs/apidocs/com/datatorrent/contrib/enrich/POJOEnricher.html)***
+ * FSLoader:
***[com.datatorrent.contrib.enrich.FSLoader](https://www.datatorrent.com/docs/apidocs/com/datatorrent/contrib/enrich/FSLoader.html)***
+ * JDBCLoader:
***[com.datatorrent.contrib.enrich.JDBCLoader](https://www.datatorrent.com/docs/apidocs/com/datatorrent/contrib/enrich/JDBCLoader.html)***
+
+## Properties, Attributes and Ports
+### <a name="props"></a>Properties of POJOEnricher
+| **Property** | **Description** | **Type** | **Mandatory** | **Default
Value** |
+| -------- | ----------- | ---- | ------------------ | ------------- |
+| *includeFields* | List of fields from the database that need to be added to
the output POJO. | List<String\> | Yes | N/A |
+| *lookupFields* | List of fields from the input POJO which form a
*unique composite* key for querying the database | List<String\> | Yes | N/A |
+| *store* | Backend Store from which data should be queried for enrichment
| [BackendStore](#backendStore) | Yes | N/A |
+| *cacheExpirationInterval* | Cache entry expiry in ms. After this time,
the lookup to the store will be done again for the given key | int | No | 1 * 60 * 60 *
1000 (1 hour) |
+| *cacheCleanupInterval* | Interval in ms at which stale entries are
removed from the cache. | int | No | 1 * 60 * 60 * 1000 (1 hour) |
+| *cacheSize* | Number of entries in the cache after which eviction will start
on each addition based on LRU | int | No | 1000 |
+
+#### <a name="backendStore"></a>Properties of FSLoader (BackendStore)
+| **Property** | **Description** | **Type** | **Mandatory** | **Default
Value** |
+| -------- | ----------- | ---- | ------------------ | ------------- |
+| *fileName* | Path of the file whose data will be used for
enrichment. See [here](#JSONFileFormat) for JSON File format. | String | Yes |
N/A |
+
+
+#### Properties of JDBCLoader (BackendStore)
+| **Property** | **Description** | **Type** | **Mandatory** | **Default
Value** |
+| -------- | ----------- | ---- | ------------------ | ------------- |
+| *databaseUrl* | Connection string for connecting to the JDBC store | String | Yes
| N/A |
+| *databaseDriver* | JDBC Driver class for connecting to the JDBC store. This
driver must be on the classpath | String | Yes | N/A |
+| *tableName* | Name of the table from which data needs to be retrieved |
String | Yes | N/A |
+| *connectionProperties* | Comma-separated list of advanced connection
properties that need to be passed to the JDBC Driver. For example,
*prop1:val1,prop2:val2* | String | No | null |
+| *queryStmt* | Select statement which will be used to query the data.
This optional parameter is meant for advanced queries. | String | No | null |
+
+
+### Platform Attributes that influence operator behavior
+| **Attribute** | **Description** | **Type** | **Mandatory** |
+| -------- | ----------- | ---- | ------------------ |
+| *input.TUPLE_CLASS* | TUPLE_CLASS attribute on the input port which tells
the operator the class of the incoming POJO | Class or FQCN | Yes |
+| *output.TUPLE_CLASS* | TUPLE_CLASS attribute on the output port which tells
the operator the class of the POJO to be emitted | Class or FQCN | Yes |
+
+
+### Ports
+| **Port** | **Description** | **Type** | **Mandatory** |
+| -------- | ----------- | ---- | ------------------ |
+| *input* | Tuples which need to be enriched are received on this port |
Object (POJO) | Yes |
+| *output* | Tuples that are enriched from the external source are emitted
on this port | Object (POJO) | No |
+
+## Limitations
+The current POJOEnricher has the following limitations:
+
+1. FSLoader loads the file content in memory. Though it loads only the
composite key and composite value in memory, a very large amount of data would
bloat the memory and cause the operator to run out of memory (OOM). If the file
is large, allocate sufficient memory to the POJOEnricher.
+2. Incoming POJO should be a subset of outgoing POJO.
--- End diff --
Yes.. That's necessary.
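
For readers of the doc, a rough sketch of what this limitation means may help. It assumes that "subset" means every field of the incoming class also appears, with the same name and type, in the outgoing class. The classes `Transaction` and `EnrichedTransaction` and the helper `isFieldSubset` are hypothetical names made up for this illustration; they are not part of Malhar, and this is not how POJOEnricher itself checks the classes.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Malhar.
class PojoSubsetCheck {

  // Incoming tuple: carries only the lookup key and transaction data.
  static class Transaction {
    public int customerId;
    public double amount;
  }

  // Outgoing tuple: every incoming field plus the enriched field.
  static class EnrichedTransaction {
    public int customerId;
    public double amount;
    public String customerName;
  }

  // Returns true if every declared field of `in` is also declared in `out`
  // with the same name and the same type (the assumed meaning of "subset").
  static boolean isFieldSubset(Class<?> in, Class<?> out) {
    Map<String, Class<?>> outFields = new HashMap<>();
    for (Field f : out.getDeclaredFields()) {
      outFields.put(f.getName(), f.getType());
    }
    for (Field f : in.getDeclaredFields()) {
      if (f.isSynthetic()) {
        continue; // skip compiler- or tool-generated fields
      }
      if (!f.getType().equals(outFields.get(f.getName()))) {
        return false; // field missing in `out`, or declared with another type
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Input fields are a subset of output fields: prints true.
    System.out.println(isFieldSubset(Transaction.class, EnrichedTransaction.class));
    // The reverse does not hold because of customerName: prints false.
    System.out.println(isFieldSubset(EnrichedTransaction.class, Transaction.class));
  }
}
```

Under that assumption, the output POJO for the bank use case would declare customerId and amount exactly as the input does, and add only the fields listed in includeFields.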
> Add user documentation for Enricher on apex docs
> ------------------------------------------------
>
> Key: APEXMALHAR-2153
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2153
> Project: Apache Apex Malhar
> Issue Type: Documentation
> Reporter: Chinmay Kolhatkar
> Assignee: Chinmay Kolhatkar
>
> Add user documentation for Enricher on apex docs