[jira] [Updated] (FLUME-3097) CLONE - WAL data grows forever even though data is delivered in E2E

jason.lee (JIRA) Tue, 23 May 2017 20:44:27 -0700

     [ https://issues.apache.org/jira/browse/FLUME-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


jason.lee updated FLUME-3097:
-----------------------------
    Remaining Estimate: 12h
     Original Estimate: 12h

> CLONE - WAL data grows forever even though data is delivered in E2E
> -------------------------------------------------------------------
>
>                 Key: FLUME-3097
>                 URL: https://issues.apache.org/jira/browse/FLUME-3097
>             Project: Flume
>          Issue Type: Bug
>          Components: Master, Node, Sinks+Sources
>            Reporter: jason.lee
>            Priority: Blocker
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> With a heavy enough write load, the E2E agent WAL appears to get into a 
> state where data is constantly shuffled between the various directories / 
> states (e.g. writing, logged, sending, sent). When this happens, the WAL 
> directories grow indefinitely until the disk is exhausted, regardless of how 
> much data originally triggered the problem.
> To reproduce:
> * Use the supplied config (or something similar).
> * Write to the agent source at a rate of > 1MB/s for a short burst (using 
> something like the generator provided below).
> * Note that data is delivered to the collectorSink, but the agent's WAL 
> directories keep growing.
> The config:
> {code}
> n1 : execStream("tail -F datafile") | agentE2ESink("host", 12345);
> n2 : collectorSource(12345) | collectorSink("file://...", "n2-");
> {code}
> Generator:
> {code}
> perl -e 'while (1) { print $i++, "\n"; }' >> datafile
> {code}
> This looks and smells just like FLUME-430. I haven't yet examined the WAL or 
> destination data for duplicates / missing events.
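
One way to confirm the growth described above is to sample the size of each
WAL state directory over time. The following is a minimal sketch, not part of
the original report: the state names (writing, logged, sending, sent) come
from the description, while the WAL root path and the helper names
(dir_size, wal_snapshot, grows_monotonically) are assumptions made for
illustration.

```python
import os

# State names taken from the report. The WAL root directory passed to
# wal_snapshot() is an assumption and depends on the agent's configuration.
WAL_STATES = ("writing", "logged", "sending", "sent")


def dir_size(path):
    """Total size in bytes of regular files under path (0 if path is missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A file may move to another state directory mid-scan.
                pass
    return total


def wal_snapshot(wal_root):
    """Map each state directory name to its current size in bytes."""
    return {s: dir_size(os.path.join(wal_root, s)) for s in WAL_STATES}


def grows_monotonically(totals):
    """True if every sample is strictly larger than the previous one."""
    return len(totals) >= 2 and all(b > a for a, b in zip(totals, totals[1:]))
```

Taking a snapshot every few seconds after the generator has been stopped and
summing the values gives a series of totals. If grows_monotonically() stays
true for that series while no new input is arriving, that would be consistent
with the symptom reported here.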



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
