[ 
https://issues.apache.org/jira/browse/NIFI-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953082#comment-14953082
 ] 

Bryan Bende commented on NIFI-901:
----------------------------------

I think writing to a single cell is fine. I've been working on the HBase 
processors from where Mark left off. One thing I did was grab a batch of 
FlowFiles in onTrigger and group them by table and row, to reduce the number 
of operations submitted when all the data is for one table but each incoming 
FlowFile carries a single cell. I don't know enough about the Cassandra 
client, but a similar approach might make sense there.
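
A rough sketch of that grouping step, in plain Java with no NiFi or HBase 
dependencies. CellUpdate and CellBatcher are hypothetical names made up for 
illustration; they stand in for whatever is parsed out of each FlowFile and 
are not taken from the actual processor code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the payload of one incoming FlowFile:
// a single cell destined for a given table and row.
final class CellUpdate {
    final String table;
    final String row;
    final String column;
    final String value;

    CellUpdate(String table, String row, String column, String value) {
        this.table = table;
        this.row = row;
        this.column = column;
        this.value = value;
    }
}

final class CellBatcher {
    // Groups updates as table -> row -> cells, preserving arrival order,
    // so each row can be written with one operation instead of one per cell.
    static Map<String, Map<String, List<CellUpdate>>> groupByTableAndRow(
            List<CellUpdate> batch) {
        Map<String, Map<String, List<CellUpdate>>> grouped = new LinkedHashMap<>();
        for (CellUpdate update : batch) {
            grouped.computeIfAbsent(update.table, t -> new LinkedHashMap<>())
                   .computeIfAbsent(update.row, r -> new ArrayList<>())
                   .add(update);
        }
        return grouped;
    }

    // Counts the row-level operations the grouped batch would need.
    static int operationCount(Map<String, Map<String, List<CellUpdate>>> grouped) {
        int count = 0;
        for (Map<String, List<CellUpdate>> rows : grouped.values()) {
            count += rows.size();
        }
        return count;
    }
}
```

With three single-cell updates for one table, two of them for the same row, 
the grouped batch needs two row-level operations rather than three.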

The Get side will likely be a lot more challenging. The idea would be to 
extract cells/rows that have changed since the last time the processor ran, and 
to save that state so that if the primary node of a cluster changes, the 
processor can pick up where it left off on the new primary node. In HBase we 
are using the setTimeRange() method on the Scan object to scan for cells with a 
timestamp greater than the last execution time.
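
The incremental-fetch idea can be sketched without any HBase dependency as 
below. TimestampedCell and IncrementalFetcher are hypothetical names, and the 
in-memory list stands in for the table being scanned. The sketch uses a 
half-open window [minTimestamp, now), which is how HBase documents the range 
taken by Scan.setTimeRange(minStamp, maxStamp); how the saved lower bound 
would actually be shared across cluster nodes is left open here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a stored cell and its write timestamp.
final class TimestampedCell {
    final String row;
    final long timestamp;

    TimestampedCell(String row, long timestamp) {
        this.row = row;
        this.timestamp = timestamp;
    }
}

final class IncrementalFetcher {
    // Inclusive lower bound for the next fetch. This is the state that
    // would need to survive a change of primary node.
    private long minTimestamp;

    IncrementalFetcher(long restoredMinTimestamp) {
        this.minTimestamp = restoredMinTimestamp;
    }

    long getMinTimestamp() {
        return minTimestamp;
    }

    // Returns cells with minTimestamp <= timestamp < now, then advances
    // the lower bound to now so the next call does not return them again.
    List<TimestampedCell> fetchChanged(List<TimestampedCell> store, long now) {
        List<TimestampedCell> changed = new ArrayList<>();
        for (TimestampedCell cell : store) {
            if (cell.timestamp >= minTimestamp && cell.timestamp < now) {
                changed.add(cell);
            }
        }
        minTimestamp = now;
        return changed;
    }
}
```

A new primary node would construct its fetcher from the last saved value of 
getMinTimestamp(), so it continues from the previous window instead of 
rescanning everything.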

https://issues.apache.org/jira/browse/NIFI-817

> Create processors to get/put data with Apache Cassandra
> -------------------------------------------------------
>
>                 Key: NIFI-901
>                 URL: https://issues.apache.org/jira/browse/NIFI-901
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Joseph Witt
>              Labels: beginner
>             Fix For: 0.4.0
>
>
> Develop processors to interact with Apache Cassandra.  The current HTTP 
> processors may actually support this as is, but such configuration may be too 
> complex to provide the quality user experience desired.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
