I would like to see:

* Leader election

* Ability to balance channels across a cluster

* A discussion or design around better fault-tolerance: if a worker goes down, how would we process its data elsewhere? (a short-term solution could be adding Kafka-based channels)

* A discussion or design for balancing work across a cluster: pulling data from HDFS has to be done on a single node, but it would be helpful if processors supported some notion of pending work and could balance it across the cluster. For pulling from HDFS, that would mean listing the paths to process, then pulling them in parallel and marking each path/task finished. This should be fault-tolerant, so that if a node goes down, another node does the work (otherwise we could just use a simple partitioning scheme).
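
To make the last item a bit more concrete, here is a rough sketch of the "pending work" idea for the HDFS case: list the paths, let nodes claim them under a time-limited lease, and mark each one finished. Everything in it is hypothetical and for illustration only (the PendingWork class, claim/finish, lease_seconds, the node names and paths); none of it is an existing NiFi API. The in-memory dict stands in for whatever shared, coordinated store a real design would need.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class Task:
    path: str
    owner: Optional[str] = None  # node currently holding the lease, if any
    lease_expires: float = 0.0   # clock value after which the lease is stale
    done: bool = False


class PendingWork:
    """Hypothetical table of pending tasks; an in-memory stand-in for a shared store."""

    def __init__(self, paths: Iterable[str], lease_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._tasks: Dict[str, Task] = {p: Task(p) for p in paths}
        self._lease = lease_seconds
        self._clock = clock

    def claim(self, node: str) -> Optional[str]:
        """Hand the node one unfinished task that is unowned or whose lease expired."""
        now = self._clock()
        for task in self._tasks.values():
            if task.done:
                continue
            if task.owner is None or task.lease_expires <= now:
                task.owner = node
                task.lease_expires = now + self._lease
                return task.path
        return None

    def finish(self, node: str, path: str) -> bool:
        """Mark a task finished; refused if another node has since taken it over."""
        task = self._tasks[path]
        if task.done or task.owner != node:
            return False
        task.done = True
        return True

    def remaining(self) -> List[str]:
        return [t.path for t in self._tasks.values() if not t.done]


# Simulated run with a fake clock so lease expiry is deterministic.
now = [0.0]
work = PendingWork(["/data/a", "/data/b"], lease_seconds=10.0, clock=lambda: now[0])
first = work.claim("node-1")    # node-1 takes the first listed path
second = work.claim("node-2")   # node-2 takes the next one in parallel
work.finish("node-2", second)
now[0] = 11.0                   # node-1 went down; its lease has expired
retry = work.claim("node-2")    # node-2 picks up node-1's unfinished path
work.finish("node-2", retry)
```

The lease is what separates this from a simple partitioning scheme: a path claimed by a dead node becomes claimable again once the lease runs out, and a late finish from the original owner is rejected.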

Obviously, these get complicated quickly. But I think some of these features would really help adoption.

rb

On 05/14/2015 07:29 PM, Joe Witt wrote:
All,

With the 0.1.0 release hopefully soon available it is time to turn
towards the next release or so and get a sense of what we should focus
on.

This should include both 0.1.x and 0.2.0.

Obviously first and foremost we need to work through the existing PRs
and patches.

Beyond that we have slated the following so far:

0.1.1
https://issues.apache.org/jira/browse/NIFI/fixforversion/12332286/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-summary-panel

0.2.0
https://issues.apache.org/jira/browse/NIFI/fixforversion/12329653/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-summary-panel

But what I'd like to throw out for general discussion is what are some
of the bigger thematic things we should focus on?  Things which will
help further with community growth and utility of the product for
those folks using it now?

For example:
I'd like to see us start digging into the cluster robustness issues
(HA cluster manager w/ legit leader election, etc.).  But there are
other things as well that may be more important sooner.

Please share your thoughts as this is a great time to affect those releases.

Thanks
Joe



--
Ryan Blue
Software Engineer
Cloudera, Inc.
