Re: avro client fails to connect to flume

2012-11-07 Thread Alexander Lorenz
Hi, Caused by: java.io.IOException: Error connecting to localhost/127.0.0.1:41414 Caused by: java.net.ConnectException: Connection refused Can you telnet to this port? Do you have a firewall running? And did you try binding to eth0 or 0.0.0.0? cheers - Alex On Nov
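A minimal sketch of an Avro source bound to all interfaces, which is one way to rule out a bind problem; the agent and component names (a1, r1, c1, k1) are illustrative assumptions, only the port 41414 comes from the stack trace:

  a1.sources = r1
  a1.channels = c1
  a1.sinks = k1
  # bind to all interfaces instead of a single address
  a1.sources.r1.type = avro
  a1.sources.r1.bind = 0.0.0.0
  a1.sources.r1.port = 41414
  a1.sources.r1.channels = c1
  a1.channels.c1.type = memory
  a1.sinks.k1.type = logger
  a1.sinks.k1.channel = c1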

Re: FlumeNG Performance Questions

2012-11-07 Thread Hari Shreedharan
Hi Cameron, It seems like you are somehow hitting performance issues with the HDFS cluster. The HDFS Sink usually performs quite well, so as an experiment (if you have access, that is), can you try: a) running multiple flume agents with the same configuration on different physical

performance

2012-11-07 Thread Nathaniel Auvil
In addition to HDFS, I need to support sending events to a higher-latency (network related) target, which our current implementation mitigates by using more than one thread. The model for Flume is single threaded. How do I support this with Flume? Multiplex over n channels with a sink on each
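A hedged sketch of the topology being described, assuming the same events should reach both HDFS and the slower remote target: one source replicating into two channels, each drained by its own sink. All names, paths, and hosts here are assumptions, not taken from the thread:

  agent.sources = src1
  agent.channels = hdfsCh remoteCh
  agent.sinks = hdfsSink remoteSink
  # the replicating selector copies every event into both channels
  agent.sources.src1.type = avro
  agent.sources.src1.bind = 0.0.0.0
  agent.sources.src1.port = 4141
  agent.sources.src1.selector.type = replicating
  agent.sources.src1.channels = hdfsCh remoteCh
  agent.channels.hdfsCh.type = memory
  agent.channels.remoteCh.type = memory
  agent.sinks.hdfsSink.type = hdfs
  agent.sinks.hdfsSink.channel = hdfsCh
  agent.sinks.hdfsSink.hdfs.path = /flume/events
  agent.sinks.remoteSink.type = avro
  agent.sinks.remoteSink.channel = remoteCh
  agent.sinks.remoteSink.hostname = remote.example.com
  agent.sinks.remoteSink.port = 4545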

Re: performance

2012-11-07 Thread Hari Shreedharan
Hi Nathaniel, What do you mean by a single-threaded model? Almost all of Flume's components are multithreaded - if you mean a sink being driven by one thread - you can always add more sinks - and each one will be driven by its own thread. If you want to write the same data to multiple locations -
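Extending the sketch above, Hari's suggestion could look like two Avro sinks draining the same channel, each running on its own thread (the sink names and endpoint are again assumptions):

  agent.sinks = hdfsSink remoteSink1 remoteSink2
  # both sinks take events from the same channel and run on separate threads
  agent.sinks.remoteSink1.type = avro
  agent.sinks.remoteSink1.channel = remoteCh
  agent.sinks.remoteSink1.hostname = remote.example.com
  agent.sinks.remoteSink1.port = 4545
  agent.sinks.remoteSink2.type = avro
  agent.sinks.remoteSink2.channel = remoteCh
  agent.sinks.remoteSink2.hostname = remote.example.com
  agent.sinks.remoteSink2.port = 4545

Note that the two sinks compete for events from remoteCh, so each event is delivered to only one of them; that is what provides parallelism for a slow target. Writing the same data to multiple locations is done by fanning out to separate channels, as in the earlier sketch.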

Re: performance

2012-11-07 Thread Nathaniel Auvil
It is my understanding, perhaps incorrect, that when you start a transaction in a sink, the channel blocks until that transaction is committed. Are you saying you can have multiple sinks pulling simultaneously from a single channel and the transactional semantics will not cause blocking? On

Re: performance

2012-11-07 Thread Hari Shreedharan
The channel is a passive component. It has no notion of blocking. All the Flume channels support multiple transactions happening simultaneously - and none of these transactions block. If a channel has no events to return, the take() method will simply return null. Multiple sinks can pull events

Re: Guarantees of the memory channel for delivering to sink

2012-11-07 Thread Rahul Ravindran
Ping on the questions below about the new Spool Directory source: If we choose to use the memory channel with this source, to an Avro sink on a remote box, do we risk data loss in the event of a network partition/slow network, or if the flume-agent on the source box dies? If we choose to use

Re: Guarantees of the memory channel for delivering to sink

2012-11-07 Thread Brock Noland
Hi, Yes, if you use the memory channel, you can lose data. To not lose data, the file channel needs to write to disk... Brock On Wed, Nov 7, 2012 at 1:29 PM, Rahul Ravindran rahu...@yahoo.com wrote: Ping on the questions below about the new Spool Directory source: If we choose to use the memory channel
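For reference, a minimal file channel sketch; the channel name and the checkpoint/data directories are assumptions:

  agent.channels = fc1
  agent.channels.fc1.type = file
  agent.channels.fc1.checkpointDir = /var/flume/checkpoint
  agent.channels.fc1.dataDirs = /var/flume/data

Because events are persisted to disk before a transaction commits, they survive an agent restart; the trade-off is lower throughput than the memory channel.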

Re: Guarantees of the memory channel for delivering to sink

2012-11-07 Thread Rahul Ravindran
Hi, Thanks for the response. Does the memory channel provide transactional guarantees? In the event of a network packet loss, does it retry sending the packet? If we ensure that we do not exceed the capacity of the memory channel, does it continue retrying to send an event to the remote

Re: Guarantees of the memory channel for delivering to sink

2012-11-07 Thread Brock Noland
The memory channel doesn't know about networks. Components like the Avro source/Avro sink do. They operate on TCP/IP, and when there is an error sending data downstream they roll the transaction back so that no data is lost. I believe the docs cover this here

Adding an interceptor

2012-11-07 Thread Rahul Ravindran
Apologies. I am new to Flume, and I am probably missing something fairly obvious. I am attempting to test a timestamp interceptor and a host interceptor, but I see only a sequence of numbers at the remote end. Below is the flume config: agent1.channels.ch1.type = MEMORY

Re: Adding an interceptor

2012-11-07 Thread Hari Shreedharan
Rahul, The interceptors add headers; they do not add content to the event body. Unless you are somehow writing the headers out, you will not see them in the output. The sequence of numbers you see is generated by the SEQ source - which is what it does. Hari -- Hari Shreedharan On Wednesday,
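Building on that point, headers only become visible when something downstream uses them. A sketch, reusing the agent1/ch1 names from the question but with an assumed source and HDFS sink (not Rahul's actual config), of timestamp and host interceptors whose headers are surfaced through escape sequences in the HDFS sink path:

  agent1.sources.src1.interceptors = ts host
  agent1.sources.src1.interceptors.ts.type = timestamp
  agent1.sources.src1.interceptors.host.type = host
  # %{host} is filled from the host interceptor's header,
  # %Y-%m-%d from the timestamp header added by the timestamp interceptor
  agent1.sinks.sink1.type = hdfs
  agent1.sinks.sink1.channel = ch1
  agent1.sinks.sink1.hdfs.path = /flume/events/%{host}/%Y-%m-%d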

multiple agents

2012-11-07 Thread 오픈플랫폼개발팀
Hi, I have sources that collect multiple types of logs (mainly three types). Most of them generate at least two types of logs; that is, a server generates two types of log. For my use case, I created two separate agents running on a server to collect the logs. I am running these agents in

Re: multiple agents

2012-11-07 Thread Alexander Lorenz
Hi, you can use one config file and define multiple flows (flow1, flow2 and so on). I assume the sources of the logs are of different types, is that right? If you have one agent grab them, you could create multiple flows with
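One way to read that advice: a single agent (the agent and component names below are assumptions) carrying two independent source-channel-sink flows in one properties file:

  agent.sources = appSrc sysSrc
  agent.channels = appCh sysCh
  agent.sinks = appSink sysSink
  # flow 1: application logs
  agent.sources.appSrc.type = exec
  agent.sources.appSrc.command = tail -F /var/log/app.log
  agent.sources.appSrc.channels = appCh
  agent.channels.appCh.type = memory
  agent.sinks.appSink.type = hdfs
  agent.sinks.appSink.hdfs.path = /flume/app
  agent.sinks.appSink.channel = appCh
  # flow 2: system logs
  agent.sources.sysSrc.type = exec
  agent.sources.sysSrc.command = tail -F /var/log/syslog
  agent.sources.sysSrc.channels = sysCh
  agent.channels.sysCh.type = memory
  agent.sinks.sysSink.type = hdfs
  agent.sinks.sysSink.hdfs.path = /flume/sys
  agent.sinks.sysSink.channel = sysCh

Alternatively, two differently named agents can live in the same properties file and be started separately by passing different agent names to flume-ng.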

RE: multiple agents

2012-11-07 Thread 오픈플랫폼개발팀
Hi Alex, Thank you for your response and inputs. Yes, we have different types of log sources. I was actually thinking of the same solution, and it's well explained in the documentation. I just wanted to hear from experts about the ideal way to define the agents.