Hi Yongkun,

I'm curious why you need the same data to be pulled twice from the channel. Do you need all sinks to have read the same amount of data? Normally, to split data between batch and analytics, we send data from the source to two separate channels and have each sink read from its own channel.
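
A minimal sketch of that layout, assuming a hypothetical source named src and two hypothetical channels mcA and mcB (hsa and hsb are the sink names used further down in this thread). The replicating selector copies every event from the source into both channels:

```properties
# Hypothetical component names; only hsa and hsb come from this thread.
agent.sources = src
agent.channels = mcA mcB
agent.sinks = hsa hsb

# Replicate each event from the source into both channels.
agent.sources.src.channels = mcA mcB
agent.sources.src.selector.type = replicating

# Each sink drains its own channel, so both sinks see every event.
agent.sinks.hsa.channel = mcA
agent.sinks.hsb.channel = mcB
```

Source, channel, and sink types are omitted here and would still need to be set.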

On 08/10/2012 02:48 PM, Wang, Yongkun | Yongkun | BDD wrote:
Hi Denny,

I am working on the patch now; it's not difficult. I have listed the
changes in that JIRA.
I think you misunderstood my design: I don't maintain the order of the
events. Instead, I make sure that each sink gets the same events (or
different events, as specified by a selector).

Suppose Channel (mc) contains the following events: 4,3,2,1

If we simply enable it by configuration, it may work like this:
Sink "hsa" may get 1,3;
Sink "hsb" may get 2,4;
So different sinks will get different data. Is this what the user wants?


In my design, "hsa" and "hsb" will both get "4,3,2,1". This is a typical
case when a user wants to fan out the data to two places (e.g. one for
batch and another for real-time analysis).

Regards,
Yongkun Wang


On 12/08/10 14:29, "Denny Ye" <[email protected]> wrote:

hi Yongkun,

   JIRA can be accessed now.

   I think it is hard to tell from your proposal how the order of events
is handled. If we don't care about the order, we can discuss the value and
feasibility.  In my opinion, the data ingest flow is order-unaware; at
least, ordering is not that important for us. You can try to verify your
proposal and share the results. There may be some difficulties in keeping
a transaction across several Sinks.

-Regards
Denny Ye


2012/8/10 Wang, Yongkun | Yongkun | BDD <[email protected]>

JIRA is down again? I cannot connect to it and comment there.

I have a proposal, "Transactional Multiplex (fan out) Sink":
https://issues.apache.org/jira/browse/FLUME-1435
which contains the design for connecting one channel to multiple sinks.

You can search the email archive for it, since JIRA cannot be accessed.

I think this is more than a configuration issue. If we simply enable
several sinks on the same channel, they will take events either in a
round-robin mode or, if the sinks run at different speeds, in an
unpredictable mode.
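
For comparison, a sketch of how that one-event-to-one-sink behaviour can be made explicit with a sink group, assuming a Flume release that includes the load-balancing sink processor. The group name g1 is hypothetical:

```properties
agent.sinks = hsa hsb
agent.sinks.hsa.channel = mc
agent.sinks.hsb.channel = mc

# Hypothetical group name g1; each event still reaches only one of the sinks.
agent.sinkgroups = g1
agent.sinkgroups.g1.sinks = hsa hsb
agent.sinkgroups.g1.processor.type = load_balance
agent.sinkgroups.g1.processor.selector = round_robin
```

This balances load between the sinks; it does not give both sinks the same events.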

So it's better to have an even higher-level transaction control instead
of the transaction in the process() of each sink, as I describe in
FLUME-1435.

Regards,
Yongkun Wang


On 12/08/10 12:30, "Denny Ye (JIRA)" <[email protected]> wrote:

Denny Ye created FLUME-1479:
-------------------------------

             Summary: Multiple Sinks can connect to single Channel
                 Key: FLUME-1479
                 URL: https://issues.apache.org/jira/browse/FLUME-1479
             Project: Flume
          Issue Type: Bug
          Components: Configuration
    Affects Versions: v1.2.0
            Reporter: Denny Ye
            Assignee: Denny Ye
             Fix For: v1.3.0


If we have one Channel (mc) and two Sinks (hsa, hsb), they can be
connected to each other with a configuration such as:
{quote}
agent.sinks.hsa.channel = mc
agent.sinks.hsb.channel = mc
{quote}
This means that multiple Sinks can connect to a single Channel.
Normally, each Sink connects to its own Channel.







