Sowmya Ramesh created FALCON-1107:
-------------------------------------

             Summary: Moving recipe processing to server side
                 Key: FALCON-1107
                 URL: https://issues.apache.org/jira/browse/FALCON-1107
             Project: Falcon
          Issue Type: Sub-task
            Reporter: Sowmya Ramesh
            Assignee: Sowmya Ramesh
             Fix For: 0.7


Today, recipe cooking is client-side logic. Recipes also support extensions, 
i.e. users can cook their own custom recipes.
The decision to make it client-side logic was made for the following reasons:

  *   Keep it isolated from the Falcon server

  *   As custom recipe cooking is supported, user recipes can introduce 
security vulnerabilities and can also bring down the Falcon server

Today, Falcon provides an HDFS DR recipe out of the box. There is a plan to add 
UI support for DR in Falcon.
REST API support cannot be added for recipes because recipe processing happens 
on the client side.
If the UI is pure JavaScript [JS], then all the recipe cooking logic has to be 
repeated in JS. This is not a feasible solution: if more recipes are added, say 
DR for Hive, HBase and others, the UI won't be extensible.

For the above reasons, recipe processing should be made server-side logic.
Provided/trusted recipes [recipes provided out of the box] can run in the 
Falcon process. Recipe cooking will be done in a new process if it is a custom 
recipe [user code].

For cooking custom recipes, the proposed design should handle the security 
implications, the cases where custom user code can bring down the Falcon server 
(e.g. by trapping System.exit), and class path isolation.
It also should not destabilize the Falcon system in any way.
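
As an illustration of the class path isolation requirement only, here is a 
minimal sketch of a class loader that sees the JDK classes and the recipe jars 
but not the server's own application class path. The class and method names 
are hypothetical and not part of Falcon. The sketch does not address the 
security implications or System.exit trapping mentioned above.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical helper, not an existing Falcon class.
public class IsolatedRecipeLoader {

    /**
     * Creates a class loader that resolves JDK classes and classes from the
     * given recipe jars, but not classes on the application class path.
     */
    public static URLClassLoader create(URL[] recipeJars) {
        // The parent of the system (application) class loader is the
        // platform/extension loader, so application classes stay invisible.
        ClassLoader jdkOnly = ClassLoader.getSystemClassLoader().getParent();
        return new URLClassLoader(recipeJars, jdkOnly);
    }

    /** Returns true if the given loader can resolve the named class. */
    public static boolean canLoad(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

With a loader built this way, a custom recipe can only link against its own 
jars and the JDK, so a conflicting library version inside a recipe jar would 
not clash with the server's copy.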

A couple of approaches were discussed:

*Approach 1:*
Custom recipe cooking can be carried out separately in another Oozie WF, which 
ensures isolation. Oozie already has the ability to schedule jobs as a user 
and handles all the security aspects of it.

Pros:
- Provides isolation
- Piggybacks on Oozie, which already provides the required functionality

Cons:
- As recipe processing is done in a different WF, the user cannot easily 
determine the recipe processing status from an operations point of view, which 
adds to the operational pain. The operational issue with this approach is the 
overall apparatus needed to monitor and manage the recipe-cooking workflows. 
Oozie scheduling can also introduce arbitrary delays. Granted, we can design 
around the limitations and make use of the strengths of the approach, but it 
seems like something we should avoid if we can.
- There have been a few discussions about moving away from Oozie as the 
scheduling engine for Falcon. If this is the plan going forward, it is better 
not to add new functionality using Oozie.

*Approach 2:*
Custom recipe cooking is done on the server side in a process that is separate 
from and independent of the Falcon process, i.e. it runs in a different JVM. 
Throttling should be added to limit how many recipe cooking processes can be 
launched, keeping the machine configuration in mind.

Pros:
- Provides isolation as recipe cooking is done in an independent process

Cons:
- Performance overhead as a new process is launched for custom recipe cooking
- Adds more complexity to the system
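
Approach 2 could be sketched roughly as follows: a semaphore caps the number of 
concurrent cooking processes, and each custom recipe runs in its own child JVM. 
This is a minimal sketch under assumptions, not a proposed implementation. The 
class name, the method names, the timeout handling, and the choice to reject 
requests (rather than queue them) once the limit is reached are all 
hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical launcher, not an existing Falcon class.
public class RecipeCookingLauncher {
    private final Semaphore slots;

    /** maxConcurrent would be tuned to the machine configuration. */
    public RecipeCookingLauncher(int maxConcurrent) {
        this.slots = new Semaphore(maxConcurrent);
    }

    public int availableSlots() {
        return slots.availablePermits();
    }

    /** Builds the command line for a child JVM running the recipe's main class. */
    public static List<String> buildCommand(String classPath, String mainClass,
                                            List<String> args) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        cmd.add(classPath);
        cmd.add(mainClass);
        cmd.addAll(args);
        return cmd;
    }

    /** Runs the command in a separate process and returns its exit code. */
    public int cook(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        // Throttle: refuse to launch when all slots are in use.
        if (!slots.tryAcquire()) {
            throw new IllegalStateException("Too many recipe cooking processes");
        }
        try {
            Process child = new ProcessBuilder(command).inheritIO().start();
            if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                // Kill a recipe that runs past the timeout.
                child.destroyForcibly();
                throw new IOException("Recipe cooking timed out");
            }
            return child.exitValue();
        } finally {
            slots.release();
        }
    }
}
```

Because the recipe runs in a child JVM, a System.exit call or a crash in user 
code ends only that child, and the server sees it as an exit code.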

This bug will be used to move recipe processing for trusted recipes to the 
server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
