[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashwin Chandra Putta updated APEXMALHAR-2037:
---------------------------------------------
    Assignee: Bhupesh Chawda  (was: Ashwin Chandra Putta)

> Pluggable component to queue tuples with ability to spool to disk
> -----------------------------------------------------------------
>
>                 Key: APEXMALHAR-2037
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2037
>             Project: Apache Apex Malhar
>          Issue Type: New Feature
>            Reporter: Ashwin Chandra Putta
>            Assignee: Bhupesh Chawda
>
> There are many use cases in which we write tuples to an external system 
> using JDBC etc. At times the external system might be slow or down for a 
> while. In those cases, the current JDBC output operators fail and restart 
> until the external system is up again. Meanwhile, the DAG is slowed down by 
> this operator. To deal with such scenarios, we should write the output in a 
> reconciled fashion, where a reconciler thread writes at the pace of the 
> external system. We should also provide the ability to spool the data to 
> disk when the external system is down or the output operator's queue is 
> full.
> Here are the proposed features for the output operator:
> 1. Write to the external system in a separate reconciler thread.
> 2. Queue the tuples in memory for the reconciler thread to consume.
> 3. Spool the incoming tuples to HDFS using a WAL when the queue is full.
> 4. Read from the WAL and write to the queue as the queue is being consumed.
> 5. When the external system is able to consume as fast as the incoming 
> throughput, the WAL is not written. The queue just buffers the tuples 
> before they are written to the external system.
> This can be done on the output operator as a pluggable component that 
> queues the incoming tuples and provides a callback to dequeue the tuples 
> and write them to the external system. The component will use the WAL to 
> back up the tuples when the queue is full.
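
The queue-plus-WAL behaviour described above could be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
component proposed in this issue: the class name SpoolingQueue, its methods,
and the capacity parameter are all hypothetical, and a second in-memory deque
stands in for the HDFS-backed WAL. The writer callback is assumed to throw if
the external system is unavailable, in which case the tuple stays queued.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical sketch: a bounded in-memory queue that spills to a "WAL" when full.
class SpoolingQueue<T> {
    private final int capacity;
    private final Queue<T> memory = new ArrayDeque<>();
    // Assumption: stand-in for a WAL on HDFS; a real component would persist this.
    private final Queue<T> wal = new ArrayDeque<>();

    SpoolingQueue(int capacity) {
        this.capacity = capacity;
    }

    // Called by the operator for each incoming tuple.
    synchronized void enqueue(T tuple) {
        // Once spooling has started, keep spooling until the WAL is drained,
        // so that tuples are still delivered in arrival order.
        if (!wal.isEmpty() || memory.size() >= capacity) {
            wal.add(tuple);
        } else {
            memory.add(tuple);
        }
    }

    // Called by the reconciler thread; returns false when nothing is queued.
    synchronized boolean dequeue(Consumer<T> writer) {
        T tuple = memory.peek();
        if (tuple == null) {
            return false;
        }
        writer.accept(tuple); // if this throws, the tuple remains at the head
        memory.remove();
        // Refill the in-memory queue from the WAL as space frees up.
        while (memory.size() < capacity && !wal.isEmpty()) {
            memory.add(wal.poll());
        }
        return true;
    }

    synchronized int spooledCount() {
        return wal.size();
    }

    public static void main(String[] args) throws InterruptedException {
        SpoolingQueue<Integer> queue = new SpoolingQueue<>(2);
        for (int i = 1; i <= 5; i++) {
            queue.enqueue(i); // tuples 3, 4 and 5 go to the WAL stand-in
        }
        List<Integer> written = new ArrayList<>();
        // The reconciler thread drains the queue at its own pace.
        Thread reconciler = new Thread(() -> {
            while (queue.dequeue(written::add)) { }
        });
        reconciler.start();
        reconciler.join();
        System.out.println(written); // prints [1, 2, 3, 4, 5]
    }
}
```

When the consumer keeps up, the WAL stand-in is never touched and the
in-memory queue acts as a plain buffer, which matches feature 5 above.
Peeking before the write and removing only afterwards means a failed write
can be retried, at the cost of possible duplicates in the external system.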



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
