[ 
https://issues.apache.org/jira/browse/HADOOP-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663552#action_12663552
 ] 

jboulon edited comment on HADOOP-5018 at 1/13/09 3:57 PM:
----------------------------------------------------------------

My point is that the PipelineWriter should be an implementation of the 
ChukwaWriter interface, and that interface is really the only thing the 
collector should be aware of.
So to be able to do what you want:

1) The collector should instantiate one writer implementation based on its 
configuration
2) The writer should be able to get the collector configuration from somewhere 
(current design) or should have an init method with a Configuration parameter
3) The contract from the collector's point of view stays the same: call one 
method on the writer class, and the result is success if no exception is thrown

The delta with your implementation is:

- Remove the code block starting at: if (conf.get("chukwaCollector.pipeline") != null) ..
- Replace by something like:

String writerClassName = conf.get("chukwaCollector.writer",
    "org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter");
Class<?> myWriter = conf.getClassByName(writerClassName);
// the collector only sees the ChukwaWriter interface
ChukwaWriter st = (ChukwaWriter) myWriter.newInstance();
st.init();

- Remove all writer initialization from CollectorStub.java
- Move all the code that creates the pipeline into the init method of a 
PipelineWriter class, instead of keeping it in ServletCollector.java
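
As a rough illustration of the collector-side change described above, here is a self-contained sketch that loads a writer by class name and then only talks to it through one interface. It uses java.util.Properties as a stand-in for the Hadoop Configuration, and SketchWriter, InMemoryWriter and WriterFactory are hypothetical names made up for this example, not Chukwa classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in for the ChukwaWriter interface; the real signatures may differ.
interface SketchWriter {
    void init(Properties conf) throws Exception;
    void add(List<String> chunks) throws Exception;
}

// Hypothetical default writer that simply keeps the chunks in memory.
class InMemoryWriter implements SketchWriter {
    final List<String> stored = new ArrayList<String>();

    public void init(Properties conf) {
        // nothing to configure in this sketch
    }

    public void add(List<String> chunks) {
        stored.addAll(chunks);
    }
}

// Hypothetical helper showing the only writer-related logic the collector needs.
class WriterFactory {
    static SketchWriter create(Properties conf) throws Exception {
        // Fall back to the default writer when the key is not set.
        String name = conf.getProperty("chukwaCollector.writer",
                InMemoryWriter.class.getName());
        Class<?> cls = Class.forName(name);
        SketchWriter writer = (SketchWriter) cls.getDeclaredConstructor().newInstance();
        writer.init(conf);
        return writer;
    }
}
```

With this shape the collector never needs to know whether the configured class is a single writer or a pipeline; it calls create once and add afterwards.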

That way the writer interface is still simple, the collector class stays very 
simple, and this does not prevent anybody from having a specific writer 
implementation.
So in the end you have:

public class PipelineWriter implements ChukwaWriter
{

public void init() throws WriterException
{
    if (conf.get("chukwaCollector.pipeline") != null) {
      String pipeline = conf.get("chukwaCollector.pipeline");
      try {
        String[] classes = pipeline.split(",");
        ArrayList<PipelineStageWriter> stages =
            new ArrayList<PipelineStageWriter>();
[...]
}

public void add(List<Chunk> chunks) throws WriterException
{
  // call all PipelineStageWriter in sequence
}

}
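
To make the skeleton above a bit more concrete, here is a minimal, self-contained sketch of a pipeline writer that builds its stages in init and calls them in sequence in add. It is only one possible reading: it assumes each stage returns the chunks to hand to the next stage, uses java.util.Properties instead of the Hadoop Configuration and plain strings instead of Chunk, and Stage, DropEmptyStage, UpperCaseStage and SketchPipelineWriter are hypothetical names, not the classes from pipeline.patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical stage contract; Chukwa's PipelineStageWriter may look different.
interface Stage {
    List<String> process(List<String> chunks) throws Exception;
}

// Example filter stage: drops empty chunks.
class DropEmptyStage implements Stage {
    public List<String> process(List<String> chunks) {
        List<String> out = new ArrayList<String>();
        for (String c : chunks) {
            if (!c.isEmpty()) {
                out.add(c);
            }
        }
        return out;
    }
}

// Example pass-through stage: rewrites each chunk.
class UpperCaseStage implements Stage {
    public List<String> process(List<String> chunks) {
        List<String> out = new ArrayList<String>();
        for (String c : chunks) {
            out.add(c.toUpperCase());
        }
        return out;
    }
}

class SketchPipelineWriter {
    private final List<Stage> stages = new ArrayList<Stage>();
    List<String> lastOutput = new ArrayList<String>();

    // Build the stage list from a comma-separated list of class names.
    void init(Properties conf) throws Exception {
        String pipeline = conf.getProperty("chukwaCollector.pipeline");
        if (pipeline == null) {
            return;
        }
        for (String name : pipeline.split(",")) {
            Class<?> cls = Class.forName(name.trim());
            stages.add((Stage) cls.getDeclaredConstructor().newInstance());
        }
    }

    // Call every stage in order; an exception from any stage propagates to the caller.
    void add(List<String> chunks) throws Exception {
        List<String> current = chunks;
        for (Stage stage : stages) {
            current = stage.process(current);
        }
        lastOutput = current;
    }
}
```

From the collector's side the contract is unchanged: one call to add, and success means no exception was thrown.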



> Chukwa should support pipelined writers
> ---------------------------------------
>
>                 Key: HADOOP-5018
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5018
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/chukwa
>            Reporter: Ari Rabkin
>            Assignee: Ari Rabkin
>         Attachments: pipeline.patch
>
>
> We ought to support chaining together writers; this will radically increase 
> flexibility and make it practical to add new features without major surgery 
> by putting them in pass-through or filter classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
