[ 
https://issues.apache.org/jira/browse/FLUME-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676291#comment-13676291
 ] 

Osama Awad edited comment on FLUME-2069 at 6/5/13 8:13 PM:
-----------------------------------------------------------

Hello Jayant, here is the complete file


a1.sources = r1
a1.sinks = k1 k2 k3 k4
a1.channels = c1 c2 c3 c4

a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5140
a1.sources.r1.channels = c1 c2 c3 c4
a1.sources.r1.handler = org.apache.flume.source.http.JSONHandler

a1.sources.r1.interceptors = logging timestamp  
a1.sources.r1.interceptors.logging.type = com.xyz.flume.interceptors.LoggingInterceptor$Builder
a1.sources.r1.interceptors.timestamp.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text 
a1.sinks.k1.hdfs.filePrefix = events
a1.sinks.k1.hdfs.batchSize = 1000
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.threadsPoolSize = 100
a1.sinks.k1.hdfs.rollSize = 6144
a1.sinks.k1.hdfs.rollCount = 20

a1.sinks.k2.type = hdfs
a1.sinks.k2.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
a1.sinks.k2.hdfs.fileType = DataStream
a1.sinks.k2.hdfs.writeFormat = Text
a1.sinks.k2.hdfs.filePrefix = events
a1.sinks.k2.hdfs.batchSize = 1000
a1.sinks.k2.hdfs.round = true
a1.sinks.k2.hdfs.roundValue = 10
a1.sinks.k2.hdfs.roundUnit = minute
a1.sinks.k2.hdfs.threadsPoolSize = 100
a1.sinks.k2.hdfs.rollSize = 6144
a1.sinks.k2.hdfs.rollCount = 20

a1.sinks.k3.type = hdfs
a1.sinks.k3.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
a1.sinks.k3.hdfs.fileType = DataStream
a1.sinks.k3.hdfs.writeFormat = Text
a1.sinks.k3.hdfs.filePrefix = events
a1.sinks.k3.hdfs.batchSize = 1000
a1.sinks.k3.hdfs.round = true
a1.sinks.k3.hdfs.roundValue = 10
a1.sinks.k3.hdfs.roundUnit = minute
a1.sinks.k3.hdfs.threadsPoolSize = 100
a1.sinks.k3.hdfs.rollSize = 6144
a1.sinks.k3.hdfs.rollCount = 20

a1.sinks.k4.type = hdfs
a1.sinks.k4.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
a1.sinks.k4.hdfs.fileType = DataStream
a1.sinks.k4.hdfs.writeFormat = Text
a1.sinks.k4.hdfs.filePrefix = events
a1.sinks.k4.hdfs.batchSize = 1000
a1.sinks.k4.hdfs.round = true
a1.sinks.k4.hdfs.roundValue = 10
a1.sinks.k4.hdfs.roundUnit = minute
a1.sinks.k4.hdfs.threadsPoolSize = 100
a1.sinks.k4.hdfs.rollSize = 6144
a1.sinks.k4.hdfs.rollCount = 20

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2 k3 k4
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin 

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000000
a1.channels.c1.transactionCapacity = 1000

a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000000
a1.channels.c2.transactionCapacity = 1000

a1.channels.c3.type = memory
a1.channels.c3.capacity = 1000000
a1.channels.c3.transactionCapacity = 1000

a1.channels.c4.type = memory
a1.channels.c4.capacity = 1000000
a1.channels.c4.transactionCapacity = 1000

a1.sources.r1.channels = c1 c2 c3 c4
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
a1.sinks.k3.channel = c3
a1.sinks.k4.channel = c4

                
      was (Author: oawad79):
    Hello Jayant, here is the complete file

# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2 k3 k4
a1.channels = c1 c2 c3 c4

# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5140
a1.sources.r1.channels = c1 c2 c3 c4
a1.sources.r1.handler = org.apache.flume.source.http.JSONHandler

a1.sources.r1.interceptors = logging timestamp  
a1.sources.r1.interceptors.logging.type = com.xyz.flume.interceptors.LoggingInterceptor$Builder
a1.sources.r1.interceptors.timestamp.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text 
a1.sinks.k1.hdfs.filePrefix = events
a1.sinks.k1.hdfs.batchSize = 1000
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.threadsPoolSize = 100
a1.sinks.k1.hdfs.rollSize = 6144
a1.sinks.k1.hdfs.rollCount = 20

a1.sinks.k2.type = hdfs
a1.sinks.k2.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
a1.sinks.k2.hdfs.fileType = DataStream
a1.sinks.k2.hdfs.writeFormat = Text
a1.sinks.k2.hdfs.filePrefix = events
a1.sinks.k2.hdfs.batchSize = 1000
a1.sinks.k2.hdfs.round = true
a1.sinks.k2.hdfs.roundValue = 10
a1.sinks.k2.hdfs.roundUnit = minute
a1.sinks.k2.hdfs.threadsPoolSize = 100
a1.sinks.k2.hdfs.rollSize = 6144
a1.sinks.k2.hdfs.rollCount = 20

a1.sinks.k3.type = hdfs
a1.sinks.k3.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
a1.sinks.k3.hdfs.fileType = DataStream
a1.sinks.k3.hdfs.writeFormat = Text
a1.sinks.k3.hdfs.filePrefix = events
a1.sinks.k3.hdfs.batchSize = 1000
a1.sinks.k3.hdfs.round = true
a1.sinks.k3.hdfs.roundValue = 10
a1.sinks.k3.hdfs.roundUnit = minute
a1.sinks.k3.hdfs.threadsPoolSize = 100
a1.sinks.k3.hdfs.rollSize = 6144
a1.sinks.k3.hdfs.rollCount = 20

a1.sinks.k4.type = hdfs
a1.sinks.k4.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
a1.sinks.k4.hdfs.fileType = DataStream
a1.sinks.k4.hdfs.writeFormat = Text
a1.sinks.k4.hdfs.filePrefix = events
a1.sinks.k4.hdfs.batchSize = 1000
a1.sinks.k4.hdfs.round = true
a1.sinks.k4.hdfs.roundValue = 10
a1.sinks.k4.hdfs.roundUnit = minute
a1.sinks.k4.hdfs.threadsPoolSize = 100
a1.sinks.k4.hdfs.rollSize = 6144
a1.sinks.k4.hdfs.rollCount = 20

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2 k3 k4
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin 

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000000
a1.channels.c1.transactionCapacity = 1000

a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000000
a1.channels.c2.transactionCapacity = 1000

a1.channels.c3.type = memory
a1.channels.c3.capacity = 1000000
a1.channels.c3.transactionCapacity = 1000

a1.channels.c4.type = memory
a1.channels.c4.capacity = 1000000
a1.channels.c4.transactionCapacity = 1000

# Bind the source and sink to the channel
a1.sources.r1.channels = c1 c2 c3 c4
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
a1.sinks.k3.channel = c3
a1.sinks.k4.channel = c4

                  
> Issue with Flume load balancing round robin
> -------------------------------------------
>
>                 Key: FLUME-2069
>                 URL: https://issues.apache.org/jira/browse/FLUME-2069
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.3.1
>            Reporter: Osama Awad
>            Priority: Blocker
>
> I am not sure if this is a bug. I have an http to hdfs scenario, 3 channels 
> with 3 hdfs sinks, I have configured it to do load balancing with round 
> robin, but when the request arrives it gets replicated into all the sinks 
> instead of hitting the sink with round robin order, so I end up with the same 
> data replicated by all sinks.
> here are my configs, I have not included the source configs here
> a1.sinks.k1.type = hdfs
> a1.sinks.k1.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
> a1.sinks.k1.hdfs.fileType = DataStream
> a1.sinks.k1.hdfs.writeFormat = Text
> a1.sinks.k1.hdfs.filePrefix = events
> a1.sinks.k1.hdfs.batchSize = 1000
> a1.sinks.k2.type = hdfs
> a1.sinks.k2.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
> a1.sinks.k2.hdfs.fileType = DataStream
> a1.sinks.k2.hdfs.writeFormat = Text
> a1.sinks.k2.hdfs.filePrefix = events
> a1.sinks.k2.hdfs.batchSize = 1000
> a1.sinks.k3.type = hdfs
> a1.sinks.k3.hdfs.path = /tmp/hadoop-oawad/dfs/name2/%y-%m-%d/%H%M/%S
> a1.sinks.k3.hdfs.fileType = DataStream
> a1.sinks.k3.hdfs.writeFormat = Text
> a1.sinks.k3.hdfs.filePrefix = events
> a1.sinks.k3.hdfs.batchSize = 1000
> a1.sinkgroups = g1
> a1.sinkgroups.g1.sinks = k1 k2 k3
> a1.sinkgroups.g1.processor.type = load_balance
> a1.sinkgroups.g1.processor.selector = round_robin
> am I missing something here?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to