Re: Improvement for Chukwa Agent and Collector

Eric Yang Mon, 09 Aug 2010 17:04:51 -0700

I'd like to keep at least /v1/ in the URL to identify the API version, just
to be safe in case we change the URLs in the future.  Having / and /tool
point to an informational UI makes sense.
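
To illustrate the layout being discussed, here is a rough sketch in Python.
It is purely illustrative and not existing Chukwa code: the /rest/v1 prefix,
the handler names, and the resolve() helper are all assumptions. It shows how
a version prefix keeps the API routes separate from the / and /tool UI paths:

```python
import re

# Assumed prefix; the thread discusses both /rest/v1.0 and /v1/.
API_PREFIX = "/rest/v1"

# Hypothetical handler names, keyed by HTTP method and path pattern.
ROUTES = [
    ("POST", re.compile(r"^/adaptor$"), "add_adaptor"),
    ("GET", re.compile(r"^/adaptor$"), "list_adaptors"),
    ("GET", re.compile(r"^/adaptor/(?P<adaptor_id>[0-9a-f]+)$"), "get_adaptor"),
    ("DELETE", re.compile(r"^/adaptor/(?P<adaptor_id>[0-9a-f]+)$"), "remove_adaptor"),
    ("GET", re.compile(r"^/help$"), "help"),
]

# Unversioned paths reserved for the informational UI.
INFO_PATHS = {"/", "/tool"}


def resolve(method, path):
    """Return (handler_name, params) for a request, or None if nothing matches."""
    if method == "GET" and path in INFO_PATHS:
        return ("info_ui", {})
    if not path.startswith(API_PREFIX + "/"):
        return None  # unknown or unsupported API version
    rest = path[len(API_PREFIX):]
    for route_method, pattern, name in ROUTES:
        if route_method == method:
            match = pattern.match(rest)
            if match:
                return (name, match.groupdict())
    return None
```

A later /rest/v2 could then get its own route table without touching the
existing v1 URLs.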


Regards,
Eric

On 8/9/10 2:59 PM, "Bill Graham" wrote:

> I generally feel that all params should be passable either entirely in the
> body or entirely in the URI, regardless of which ones are required or
> optional (with the exception of the asset id, which typically is in the path
> regardless). In this case I vote for passing them all in the body as a JSON
> blob (if Content-Type is set to application/json, that is).
> 
> Thinking more about the base path to the API that I proposed, perhaps the
> /v1.0 in the URL is overkill. I could go for removing that part. The /rest
> path has value to me though, because I could see keeping '/' or '/tool'
> to potentially point to an HTML summary page or mini-UI at some point.
> 
> 
> 
> On Mon, Aug 9, 2010 at 2:42 PM, Eric Yang wrote:
>> Hi Bill,
>> 
>> I like your design better.  +1 on the revised version.  RecordType and
>> Adaptor are required parameters; would it make sense to put them in the
>> path as path parameters for POST?
>> 
>> Regards,
>> Eric
>> 
>> On 8/9/10 11:33 AM, "Bill Graham" wrote:
>> 
>>> I agree that we should implement the features you suggest. I've been
>>> thinking about a REST API for the agents lately, as I'd also like to be able
>>> to expose statistics to help with monitoring. Something similar to what the
>>> collector does, so you can attach monitoring to a URL and see if the average
>>> data rate suddenly drops.
>>> 
>>> Regarding the proposed API protocol, I think we should use POST, GET and
>>> DELETE to create, fetch and remove adaptors, similar to how you propose, but
>>> the identifier in the rest resource should be the adaptor id, not the
>>> filename. This is more RESTful since the adaptor is the thing being
>>> accessed, not the file. Also, you could have more than one adaptor on a
>>> given file and some adaptors (e.g., JMSAdaptor) don't have a file associated
>>> with them.
>>> 
>>> I propose something like this:
>>> 
>>> - Add Adaptor:
>>> 
>>> POST /rest/v1.0/adaptor HTTP/1.0
>>> Accept: text/plain
>>> Content-Type: application/json
>>> { "RecordType" : "jvm", "Cluster": "demo", adaptor configs including offset,
>>> other tags ... }
>>> 
>>> Returns: adaptor metadata including id
>>> 
>>> - Get Adaptor fcb0fe44e9dd6d2283962cb0e3b4ea0f:
>>> 
>>> GET /rest/v1.0/adaptor/fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
>>> 
>>> - Remove Adaptor fcb0fe44e9dd6d2283962cb0e3b4ea0f:
>>> 
>>> DELETE /rest/v1.0/adaptor/fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
>>> 
>>> - List all adaptors:
>>> GET /rest/v1.0/adaptor HTTP/1.0
>>> 
>>> - Help
>>> GET /rest/v1.0/help HTTP/1.0
>>> 
>>> - Statistics for all adaptors
>>> GET /rest/v1.0/adaptorStats HTTP/1.0
>>> 
>>> - Statistics for a single adaptor
>>> GET /rest/v1.0/adaptorStats/fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
>>> 
>>> Thoughts?
>>> 
>>> thanks,
>>> Bill
>>> 
>>> On Mon, Aug 9, 2010 at 10:01 AM, Eric Yang wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> Chukwa Agent has a custom command protocol (port 9093).  The current
>>>> protocol is not easy to modify to implement security-related features such
>>>> as authentication and authorization.  I would like to propose that we use
>>>> a REST-like web service protocol to improve security and be more aligned
>>>> with web standards.  Let's go through the use cases of the Chukwa Agent
>>>> command protocol:
>>>> 
>>>> Start an adaptor:
>>>> 
>>>> Current command: Add
>>>> 
>>>> 
>>>> org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped
>>>> /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log 0
>>>> 
>>>> Proposed:
>>>> POST /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
>>>> Accept: chukwa/UTF8NewLineEscaped (optional)
>>>> Offset: 0 (optional)
>>>> Content-Type: application/json
>>>> { "RecordType" : "jvm", "Cluster": "demo", other tags ... }
>>>> 
>>>> List adaptors:
>>>> 
>>>> Current command: List
>>>> 
>>>> Proposed:
>>>> GET / HTTP/1.0
>>>> Accept: text/html
>>>> Get a list of information about all streaming adaptors
>>>> 
>>>> HEAD /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
>>>> or
>>>> HEAD /adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
>>>> Get information about the streaming adaptor only.
>>>> 
>>>> Stop adaptors:
>>>> 
>>>> Current command: Stop adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f
>>>> 
>>>> Proposed:
>>>> DELETE /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
>>>> or
>>>> DELETE /adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
>>>> Delete the adaptor
>>>> 
>>>> Help:
>>>> Current command: Help
>>>> 
>>>> Proposed:
>>>> GET /help HTTP/1.0
>>>> Accept: text/html
>>>> 
>>>> With this modification, we can support encryption and Basic/Digest
>>>> Authentication from existing libraries without reinventing the wheel.  If the
>>>> community is ok with this change, I would like to propose the next
>>>> improvement:
>>>> 
>>>> Chukwa Agent and collectors are two different feature sets, but there
>>>> shouldn't be any roadblock to building a switch that toggles a machine
>>>> to serve different responsibilities.  For example, a Chukwa agent machine
>>>> can flip a switch to join the collector pool and continue to stream data
>>>> from itself.  With this improvement, it is easier to dynamically create a
>>>> bigger data collection pipeline on the fly.  Both systems use the same
>>>> communication protocol, hence they are easier to manage.  In the future,
>>>> we can add additional commands like TRACE /config/reload to reload
>>>> configuration, and tap into ZooKeeper for managing data flow in
>>>> centralized configuration management.
>>>> 
>>>> Any thoughts?
>>>> 
>>>> Regards,
>>>> Eric
>>>> 
>>>> 
>>> 
>> 
> 
>
