Hi,

I was looking at the Falcon basic user guide [1] and the recent Hortonworks blog post on the same topic [2].
I was wondering whether there is any proposal to reduce the amount of XML needed to ingest a new feed or process into the system. Could we have some properties defined globally in the system, for example:

- Cluster A, Cluster B, etc.
- Cluster A temp dir, Cluster B temp dir
- Cluster A Hive parent dir, Cluster B Hive parent dir

Then for any new feed we would just need to write something similar to what we do in a Cascading or Pig script: 3-4 declarative steps saying what has to be done with that data. Write 3-4 lines of code and it's all done. Falcon could generate the XMLs in the background if needed to make it work, but writing XML for ingesting every new feed is the scariest thing for me right now about using it in production. Imagine we have 500 feeds; how many XML files would be needed to support them?

What are your thoughts on this?
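To make it concrete, here is a rough sketch of the kind of generator I mean (plain Python, not any real Falcon API; the cluster names, paths, and defaults are all made up). It emits the feed XML described in [1] from a few declarative properties, with the cluster settings defined once globally:

#!/usr/bin/env python
# Rough sketch, not a real Falcon API: generate feed entity XML from a
# few declarative properties plus globally defined cluster settings.
# Element names follow the entity specification [1]; they would need to
# be checked against the actual feed XSD.
import xml.etree.ElementTree as ET

# Defined once per installation, instead of repeated in every feed XML.
CLUSTERS = {
    "cluster-a": {"temp_dir": "/tmp/a", "hive_parent_dir": "/apps/hive/a"},
    "cluster-b": {"temp_dir": "/tmp/b", "hive_parent_dir": "/apps/hive/b"},
}

def feed_xml(name, path, frequency="hours(1)", retention="days(90)",
             clusters=("cluster-a",)):
    """Build Falcon feed XML from a handful of declarative properties."""
    feed = ET.Element("feed", name=name)
    feed.set("xmlns", "uri:falcon:feed:0.1")
    ET.SubElement(feed, "frequency").text = frequency
    cs = ET.SubElement(feed, "clusters")
    for c in clusters:
        cl = ET.SubElement(cs, "cluster", name=c, type="source")
        ET.SubElement(cl, "validity", start="2014-01-01T00:00Z",
                      end="2099-12-31T00:00Z")
        ET.SubElement(cl, "retention", limit=retention, action="delete")
    locs = ET.SubElement(feed, "locations")
    ET.SubElement(locs, "location", type="data", path=path)
    ET.SubElement(feed, "ACL", owner="etl", group="users", permission="0755")
    ET.SubElement(feed, "schema", location="/none", provider="none")
    return ET.tostring(feed, encoding="unicode")

# The 3-4 declarative lines a user would actually write per feed:
print(feed_xml("clicks", "/data/clicks/${YEAR}-${MONTH}-${DAY}",
               frequency="hours(1)", clusters=("cluster-a", "cluster-b")))

The last call is the only part anyone would write for a new feed; everything else is defined once. With 500 feeds that is 500 short declarations instead of 500 XML files.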
Thanks,
Jagat Singh

[1] http://falcon.incubator.apache.org/docs/EntitySpecification.html
[2] http://hortonworks.com/hadoop-tutorial/defining-processing-data-end-end-data-pipeline-apache-falcon/