[ANNOUNCE] Change of Apache Flume PMC Chair

2019-03-23 Thread Mike Percy
Dear Flume community, I have had the opportunity to serve as the Flume PMC Chair for the last year and some months, and for personal reasons have decided to step down at this time. I am very happy to announce that based on the PMC's recommendation, the Apache Foundation board has appointed

Re: [ANNOUNCE] New Flume PMC member - Ferenc Szabo

2019-01-30 Thread Mike Percy
Congratulations Ferenc and welcome to the PMC! Thanks so much for all your diligence and for running the 1.9.0 release! Best regards, Mike On Wed, Jan 30, 2019 at 8:09 AM Denes Arvay wrote: > Hello Flume community, > > On behalf of the Apache Flume PMC I am pleased to announce that Ferenc >

Re: Append existing Avro file - HDFS Sink

2018-10-12 Thread Mike Percy
Also consider setting up a Spark job or similar (Impala, Hive) to periodically read the Avro files and output in a columnar format (Parquet or ORC) which would give you small-files compaction (assuming you delete the source files periodically) and better analytical read performance on the columnar

Re: [ANNOUNCE] New Flume PMC Chair

2018-01-18 Thread Mike Percy
. Best regards, Mike On Thu, Jan 18, 2018 at 12:37 PM, Hari Shreedharan <hshreedha...@apache.org> wrote: > Hi all, > > It gives me immense happiness to announce that the Apache Software > Foundation Board has appointed Mike Percy as the new PMC chair of the > Apache Flume Project.

[ANNOUNCE] New Flume committers and PMC member

2017-11-07 Thread Mike Percy
done! Best regards, Mike Percy on behalf of the Flume PMC

Re: [ANNOUNCE] Apache Flume 1.8.0 released

2017-10-06 Thread Mike Percy
Congrats all! Nice work. Regards, Mike On Wed, Oct 4, 2017 at 9:57 AM, Denes Arvay wrote: > The Apache Flume team is pleased to announce the release of Flume > version 1.8.0. > > Flume is a distributed, reliable, and available service for efficiently > collecting,

Re: Avro to Parquet conversion

2017-08-30 Thread Mike Percy
I know that this reply is quite late. I'm not aware of any Flume Parquet writer that currently exists. If it was me I would stream it to HDFS in Avro format and then use an ETL job (perhaps via Spark or Impala) to convert the Avro to Parquet in large batches. Parquet is well suited to large

[ANNOUNCE] New Flume committer - Denes Arvay

2017-05-21 Thread Mike Percy
On behalf of the Apache Flume PMC, I am very pleased to welcome Denes Arvay as a committer on the Apache Flume project. Denes has put a lot of effort into improving the stability of Flume, most recently focusing on identifying and fixing serious and hard-to-diagnose issues including several bugs

Re: [ANNOUNCE] Apache Flume 1.7.0 released

2016-10-18 Thread Mike Percy
Woot! Congrats everyone! Thanks Donat for working so hard to get this version of Flume out the door! Best, Mike On Tue, Oct 18, 2016 at 10:09 AM, Bessenyei Balázs Donát wrote: > The Apache Flume team is pleased to announce the release of Flume > version 1.7.0. > > Flume is

Re: Which theme flume.apache.org use?

2016-10-17 Thread Mike Percy
If you want to improve the web site look + feel, you can check out the source code for the web site at http://svn.apache.org/viewvc/flume/site/trunk/ If you want to submit a patch for the web site, please let us know what you're trying to do on d...@flume.apache.org so we can provide some

[ANNOUNCE] Two new Flume committers

2016-09-19 Thread Mike Percy
Hi Apache Flume community, I am very happy to announce that the Flume PMC has voted to add Bessenyei Balázs Donát and Jeff Holoman as committers in recognition of their contributions to Flume. Over the past few months, Donat has contributed and reviewed many patches, more than any non-committer.

Re: File left as OPEN_FOR_WRITE state.

2016-07-26 Thread Mike Percy
I believe that is supported as of Flume 1.5.0: http://flume.apache.org/FlumeUserGuide.html#hdfs-sink See hdfs.retryInterval If you think there is a problem with that behavior, please file a bug. Regards, Mike On Wed, Jul 20, 2016 at 1:43 AM, no jihun wrote: > In fact

Re: Flume 1.7.0 Release Schedule

2016-07-12 Thread Mike Percy
There is discussion on the dev list about a Flume 1.7.0 release, but no committed date yet. We're making some progress towards that goal, though. Mike On Fri, Jun 10, 2016 at 10:06 PM, Jason Williams wrote: > Hey Joe, > > I may try 1.7 if you think it's pretty stable

Re: Overwriting guava library in flume

2016-07-12 Thread Mike Percy
It's because Hadoop is currently stuck on Guava 11, and Guava is notoriously backwards incompatible. You can read the comment here for some more (slightly tangential) info:

Re: [ANNOUNCE] Change of Apache Flume PMC Chair

2015-10-23 Thread Mike Percy
Congrats Hari! On Wed, Oct 21, 2015 at 5:50 PM, Arvind Prabhakar wrote: > Dear Flume Users and Developers, > > I have had the pleasure of serving as the PMC Chair of Apache Flume since > its graduation three years ago. I sincerely thank you and the Flume PMC for > this

Re: [ANNOUNCE] New Flume committer - Johny Rufus

2015-06-19 Thread Mike Percy
Congratulations on your committership Rufus, it's well deserved! Best, Mike On Fri, Jun 19, 2015 at 1:38 PM, Hari Shreedharan hshreedha...@apache.org wrote: On behalf of the Apache Flume PMC, I am excited to welcome Johny Rufus as a committer on the Apache Flume project. Johny has actively

Re: How to convert *.bz2.tmp to *.bz2 file after restating the instance

2014-11-12 Thread Mike Percy
Depending on your configuration setup, every batch is likely writing a stream of bzip2 and these are effectively concatenated together into a single file. So Hive should (hopefully) be reading all of them except the last (partial) batch, which is OK to throw away because Flume will retry it when

Re: [ANNOUNCE] New Flume PMC Member - Roshan Naik

2014-11-10 Thread Mike Percy
Welcome Roshan, and congrats. Mike On Wed, Nov 5, 2014 at 5:50 PM, Gwen Shapira gshap...@cloudera.com wrote: Congrats, Roshan :) Very much deserved. On Tue, Nov 4, 2014 at 2:12 PM, Arvind Prabhakar arv...@apache.org wrote: On behalf of Apache Flume PMC, it is my pleasure to announce

Re: one-to-many interceptor

2014-06-25 Thread Mike Percy
these shenanigans. But it's nice to know that I _can_ do it this way if I need to. Cheers -mt On Tue, Jun 24, 2014 at 7:45 PM, Mike Percy mpe...@apache.org wrote: Hi Matt, If you can guarantee there are a certain # of events in a single wrapper event, or bound the limit, then you could

Re: flume starting through service

2014-06-25 Thread Mike Percy
/flume-ng/conf/flume.conf -n agent. but no result in the output. On Wed, Jun 25, 2014 at 10:27 AM, Mike Percy mpe...@apache.org wrote: Ok so what's the error? Mike Sent from my iPhone On Jun 24, 2014, at 10:44 PM, kishore alajangi alajangikish...@gmail.com wrote: Flume 1.4.0

Re: one-to-many interceptor

2014-06-24 Thread Mike Percy
Hi Matt, If you can guarantee there are a certain # of events in a single wrapper event, or bound the limit, then you could potentially get away with this. However if you're not careful you could get stuck in an infinite fail-backoff-retry loop due to exceeding the (configurable) channel

Re: flume starting through service

2014-06-24 Thread Mike Percy
Can you please provide details on the errors you are seeing? What version of Flume? On Thu, Jun 19, 2014 at 12:43 AM, kishore alajangi alajangikish...@gmail.com wrote: The flume is writing to hdfs when I start flume manually through config file like flume-ng agent -c /etc/flume-ng/conf -f

Re: Json over netcat source

2014-04-10 Thread Mike Percy
Not sure either but make sure you're using a compatible version of ElasticSearch. Sent from my iPhone On Apr 9, 2014, at 9:43 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: Then I really don't know what the issue is. Someone more familiar with elastic search sink will need to

Re: Adding SSL peer cert info to AvroSource

2014-02-07 Thread Mike Percy
, maybe it's applicable here. http://stackoverflow.com/questions/9573894/set-up-netty-with-2-way-ssl-handsake-client-and-server-certificate Mike -Charles On Jan 30, 2014, at 7:21 PM, Mike Percy mpe...@apache.org wrote: I am not an expert in the JSSE API, so without specifics regarding APIs

Re: Adding SSL peer cert info to AvroSource

2014-02-07 Thread Mike Percy
On Fri, Feb 7, 2014 at 5:15 PM, Pritchard, Charles X. -ND charles.x.pritchard@disney.com wrote: I’m finding it a challenge to see where in the AvroSource class I could actually push the data into Event headers. All of those methods are stateless when it comes to the connection — they

Re: Adding SSL peer cert info to AvroSource

2014-02-07 Thread Mike Percy
cert, of course, wouldn’t make any sense. Most of them aren’t using SSL either, as it’s within a trusted network at that point. Yeah, that was my point. :) Mike -Charles On Feb 7, 2014, at 6:33 PM, Mike Percy mpe...@apache.org wrote: On Fri, Feb 7, 2014 at 5:15 PM, Pritchard, Charles

Re: Adding SSL peer cert info to AvroSource

2014-01-30 Thread Mike Percy
. From: Mike Percy [mpe...@apache.org] Sent: Wednesday, January 29, 2014 6:44 PM To: user@flume.apache.org Subject: Re: Adding SSL peer cert info to AvroSource If it's using a signed cert then what do you need to put into the filter? You mean a list of allowed peers? If so then you could either

Re: Adding SSL peer cert info to AvroSource

2014-01-29 Thread Mike Percy
If it's using a signed cert then what do you need to put into the filter? You mean a list of allowed peers? If so then you could either try to piggyback on the IpFilter and make it accept hostnames, or yes add another filter config option such as hostFilter. Mike On Wed, Jan 29, 2014 at 12:23

Re: Extra information being delivered via Flume

2013-10-11 Thread Mike Percy
Check out the latest trunk code... We just committed FLUME-1666 courtesy of Jeff Lord this week. Mike Sent from my iPhone On Oct 10, 2013, at 11:56 AM, DSuiter RDX dsui...@rdx.com wrote: Hi all, We set up a pipeline to get rsyslog input from a remote server via TCP using rsyslog

Re: Extra information being delivered via Flume

2013-10-11 Thread Mike Percy
Or if that doesn't work try the Netcat source. Sent from my iPhone On Oct 10, 2013, at 11:46 PM, Mike Percy mpe...@apache.org wrote: Check out the latest trunk code... We just committed FLUME-1666 courtesy of Jeff Lord this week. Mike Sent from my iPhone On Oct 10, 2013, at 11:56

Flume user meetup @ Hadoop World NYC on Oct 29th (Tue)

2013-10-10 Thread Mike Percy
Hi all, We are hosting a Flume user meetup during Strata / Hadoop World in New York on Tuesday, Oct 29th @ 6:30PM at the Hilton, which is the conference venue. This is your chance to meet up with other users, committers and PMC members, bat around ideas, bounce problems off of each other, and

Re: [ANNOUNCE] New Flume Committer - Wolfgang Hoschek

2013-09-24 Thread Mike Percy
Congrats Wolfgang, and welcome! Mike On Tue, Sep 24, 2013 at 3:46 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Congratulations Wolfgang, well done! Jarcec On Tue, Sep 24, 2013 at 03:39:12PM -0700, Hari Shreedharan wrote: On behalf of the Apache Flume PMC, I am excited to welcome

Re: [ANNOUNCE] New Flume Committer - Roshan Naik

2013-09-24 Thread Mike Percy
Congrats Roshan, welcome! Mike On Tue, Sep 24, 2013 at 3:47 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Congratulations Roshan, well done! Jarcec On Tue, Sep 24, 2013 at 03:39:13PM -0700, Hari Shreedharan wrote: On behalf of the Apache Flume PMC, I am excited to welcome Roshan Naik

Re: Block Under-replication detected. Rotating file.

2013-08-22 Thread Mike Percy
Are you sure your HDFS cluster is configured properly? How big is the cluster? It's complaining that your HDFS blocks are not replicated enough based on your configured replication factor, and tries to get a sufficiently replicated pipeline by closing the current file and opening a new one to

[ANNOUNCE] Apache Flume 1.4.0 released

2013-07-02 Thread Mike Percy
Jarek Jarcec Cecho Jeff Lord Joey Echeverria Jolly Chen Juhani Connolly Mark Grover Mike Percy Mubarak Seyed Nitin Verma Oliver B. Fischer Patrick Wendell Paul Chavez Pedro Urbina Escos Phil Scala Rahul Ravindran Ralph Goers Roman Shaposhnik Roshan Naik Sravya Tirukkovalur Steve Hoffman Ted Malaska

Re: Apache Flume meetup at Hadoop Summit

2013-06-25 Thread Mike Percy
This event is tonight! Hope to see many of you there. Mike On Tue, Jun 25, 2013 at 12:58 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: Hi all, I am sorry if this is a bit late, but I'd like to invite you all to the Flume meetup at Hadoop Summit in San Jose, CA. Please see

Re: Expirience in using Apache Flume in OSGi environment

2013-05-22 Thread Mike Percy
Andrey, I don't know of anyone doing that and I'd be surprised if you didn't run into some issues. We try to avoid static instances but who knows. I would try to just use the Flume client API (SDK) in your app and deploy Flume as a normal daemon. Mike On Tue, May 21, 2013 at 5:31 AM, Andrey

Re: Spooling fileSuffix attribute ignored

2013-05-22 Thread Mike Percy
Hi Phil, Nice approach. How is the spooling directory source working for you? Any thoughts on how it could be improved? Mike On Tue, May 21, 2013 at 8:17 AM, Phil Scala phil.sc...@globalrelay.net wrote: Hi, Based on my use and understanding that setting “fileSuffix” is simply the

Re: How to get a bad message out of the channel?

2013-05-22 Thread Mike Percy
plugin a 'failsafe' path to write messages to when they are missing that kind of data? --Matt On May 10, 2013, at 6:30 PM, Mike Percy mpe...@apache.org wrote: Hook up a HDFS sink to them that doesn't use %Y, %m, etc in the configured path. HTH, Mike On May 10, 2013, at 11:00 AM, Matt

Re: What does the file header mean ? Flume always add headers to file header

2013-05-22 Thread Mike Percy
You probably figured this out by now but those are Avro container files :) see http://avro.apache.org Regards Mike On Wed, May 15, 2013 at 3:06 AM, higkoohk higko...@gmail.com wrote: Maybe it make by 'tengine.sinks.hdfs4log.serializer = avro_event' , but still don't know why and howto ...

Re: Setting Hadoop-specific settings for the HDFS plugin?

2013-05-22 Thread Mike Percy
You can do it in your hdfs-site.xml file which Flume will pull in when it detects Hadoop from the environment. Mike On Wed, May 15, 2013 at 9:22 AM, Matt Wise m...@nextdoor.com wrote: How do I pass hadoop-specific configuration settings to the HDFS plugin in Flume 1.3.0? Specifically, I need

Re: AvroSource HTTP vs Netty with Python bindings..

2013-05-22 Thread Mike Percy
Yep still true. There is a Thrift source on trunk though, also consider the HTTP source for integration with Python. Mike On Wed, May 8, 2013 at 1:25 PM, Matt Wise m...@nextdoor.com wrote: It seems like the current Python Avro package does not support the Flume-NG AvroSource... Is this still

Re: Spooling fileSuffix attribute ignored

2013-05-22 Thread Mike Percy
a workaround. Thanks a lot. From: Mike Percy mpe...@apache.org Reply-To: Flume User List user@flume.apache.org Date: Wednesday, May 22, 2013 09:35 To: Flume User List user@flume.apache.org Subject: Re: Spooling fileSuffix attribute ignored Hi Phil, Nice approach. How is the spooling

Re: Checking channel size.

2013-05-22 Thread Mike Percy
You can attach to the process locally via JMX and pull the metric from there. I'm not sure how to do it via the command line though. Mike On Wed, May 22, 2013 at 12:54 AM, Pranav Sharma pra...@compasslabs.com wrote: Is there a way to check the size of a channel either programmatically or

Re: Missing headers when using AVRO Sink/Source

2013-05-22 Thread Mike Percy
FYI there is a stock timestamp interceptor, if you want to use that. Mike On May 22, 2013, at 3:20 AM, ZORAIDA HIDALGO SANCHEZ zora...@tid.es wrote: Dear all, I made a custom interceptor in order to insert the timestamp header that is used by the HDFS sink. Firstly, I run an example

Re: Flume 1.4 release

2013-05-21 Thread Mike Percy
Hi Rahul, I think end of June is a little tight, usually it takes a while to do a release and we have not discussed it lately. I'd say early July is more likely. Let me start a discussion. Regards, Mike On Tue, May 21, 2013 at 10:49 AM, Rahul Ravindran rahu...@yahoo.com wrote: Hi, Is

Re: Expirience in using Apache Flume in OSGi environment

2013-05-20 Thread Mike Percy
Andrey, What is the use case? Can you provide more detail? Thanks, Mike On Sat, May 18, 2013 at 2:53 AM, Andrey Poltavtsev apoltavt...@gmail.com wrote: Hi, I did not find in the existing Apache Flume distribution | documentation (User guide | Developers Guide) any information regarding using

Re: How to get a bad message out of the channel?

2013-05-10 Thread Mike Percy
Hook up a HDFS sink to them that doesn't use %Y, %m, etc in the configured path. HTH, Mike On May 10, 2013, at 11:00 AM, Matt Wise m...@nextdoor.com wrote: Eek, this was worse than I thought. Turns out message continued to be added to the channels, but no transactions could complete to take

Re: Flume service error from cloudera manager

2013-04-19 Thread Mike Percy
(bcc: user@flume.apache.org) Hi Madhu, Thanks for reaching out. The appropriate support channel for Cloudera Manager is the cdh-u...@cloudera.org email list. I have redirected your question there. Regards, Mike On Thu, Apr 18, 2013 at 9:01 PM, Madhusudhan Reddy Munagala

Re: How do I search the past posts for a topic?

2013-04-03 Thread Mike Percy
Jayashree, I like to use the search-hadoop.com site provided by Sematext: http://search-hadoop.com/?q=fc_project=Flume The logger is intended mainly for debugging. It will print data to the flume.log file itself. Regards, Mike On Sun, Mar 31, 2013 at 6:56 PM, JR mailj...@gmail.com wrote:

Re: Simple HDFS Sink file rolling question please.

2013-03-25 Thread Mike Percy
Hi Chris, Check out hdfs.idleTimeout parameter. Maybe set it to 5 minutes (i.e. hdfs.idleTimeout = 300) or something. http://flume.apache.org/FlumeUserGuide.html Regards, Mike On Thu, Mar 21, 2013 at 1:21 PM, Chris Neal cwn...@gmail.com wrote: Hi :) I have an ExecSource running a tail -F

Re: Parameters in Configuration File

2013-03-15 Thread Mike Percy
Hey Connor, Take a look at the discussion @ https://issues.apache.org/jira/browse/FLUME-1941 If you want to help work on this you are more than welcome to :) Regards, Mike On Thu, Mar 14, 2013 at 5:09 PM, Connor Woodson cwoodson@gmail.com wrote: Does the Flume configuration file support

Re: Parameters in Configuration File

2013-03-15 Thread Mike Percy
of boilerplate logic in the various implementations of Configurable.configure(Context) that is just completely unnecessary. In addition, you wouldn't have to spend time building something like property substitution that's been done many times before. From: Mike Percy mpe...@apache.org Reply

Re: Writing to HDFS from multiple HDFS agents (separate machines)

2013-03-14 Thread Mike Percy
Hi Gary, All the suggestions in this thread are good. Something else to consider is that adding multiple HDFS sinks pulling from the same channel is a recommended practice to maximize performance (competing consumers pattern). In that case, not only would it be a good idea to put the data into

Re: Flume secure communication

2013-03-12 Thread Mike Percy
No network encryption support yet but there is a patch up at https://issues.apache.org/jira/browse/FLUME-997 for this functionality. You are welcome to take a look and provide any comments. Not sure what you mean by #2, you would have to share more about your requirements / use case. Regards,

Re: Flume secure communication

2013-03-12 Thread Mike Percy
nobody had worked on it. Regards, Mike On Tue, Mar 12, 2013 at 12:41 PM, Mike Percy mpe...@apache.org wrote: It's certainly possible to sniff the wire traffic using some tool like WireShark. Regards, Mike Sent from my iPhone On Mar 12, 2013, at 5:29 AM, Deepak Tiwari dtiwari...@gmail.com

Re: Flume Ng replaying events when the source is idle

2013-03-04 Thread Mike Percy
Sagar, Just try tail -F on the same file over and over on the command line. It will display the last few lines. If you want to avoid this, try tail -F -n 0 filename and you should not see this. Every time you reload your configuration file, the specified command is re-executed by the source.

Re: It's better not to use thrift?

2013-02-20 Thread Mike Percy
Hari has done recent work on a modern Thrift RPC implementation. The existing impl. is there for legacy purposes and does not have a batch append() call so it turns out to be quite slow. Have you considered using the HTTP source? With decent batch sizes and keep-alive the performance might be

Re: Analysis of Data

2013-02-08 Thread Mike Percy
of flume which then converts the data and runs quick analysis on data in memory and update the global counters kind of things which then can be sink to live reporting systems. Thanks, Nitin On Fri, Feb 8, 2013 at 2:26 PM, Mike Percy mpe...@apache.org wrote: Nitin, Good to hear more

Re: How to load zip file into hdfs sink using flume-ng

2013-02-08 Thread Mike Percy
Actually it might be tricky to use the directory spooling source to read a compressed archive. It's possible, but you would definitely need to write your own deserializer. Flume is an event-oriented streaming system, it's not really optimized to be a plain file transfer mechanism like FTP.

Re: Analysis of Data

2013-02-07 Thread Mike Percy
Let's take this conversation further. What is missing? On Thu, Feb 7, 2013 at 2:39 AM, Inder Pall inder.p...@gmail.com wrote: flume is a platform to get events to the right sink (HDFS, local-file, ) analytics is not something which falls in it's territory - Inder On Thu, Feb 7, 2013

Re: Flume NG and zookeeper

2013-02-07 Thread Mike Percy
Integrate in what way? On Thu, Feb 7, 2013 at 6:36 PM, 吳瑞琳 rlwu...@gmail.com wrote: Hi all, I am trying to integrate Flume NG and zookeeper. However, I did not find any configuration about this in Flume NG. Could you please advise how to deal with this? Thanks, RL

Re: Analysis of Data

2013-02-07 Thread Mike Percy
to downstream agents who can do the heavy processing, it seems to me that this requirement is easily fulfilled by Flume. Regards, Mike On Thu, Feb 7, 2013 at 4:29 PM, Mike Percy mpe...@apache.org wrote: Let's take this conversation further. What is missing? On Thu, Feb 7, 2013 at 2:39 AM, Inder

Re: Flume NG and zookeeper

2013-02-07 Thread Mike Percy
Zookeeper node when it is up. Regards, RL 2013/2/8 Mike Percy mpe...@apache.org Integrate in what way? On Thu, Feb 7, 2013 at 6:36 PM, 吳瑞琳 rlwu...@gmail.com wrote: Hi all, I am trying to integrate Flume NG and zookeeper. However, I did not find any configuration about this in Flume NG

Re: Flume and JMX

2013-02-06 Thread Mike Percy
What exactly were you looking for? On Wed, Feb 6, 2013 at 7:35 AM, matt.elli...@gdc4s.com wrote: I am looking to deploy and manage multiple Flume deployments through JMX. Besides the JMXPollUtil does Flume have any hooks that would enable this? Thanks, Matt

Re: Flume and JMX

2013-02-06 Thread Mike Percy
. What I’m wondering is if flume provides JMX type services like your standard app containers like jboss, web sphere, etc. If not can we make use of the mbeans that are already there? From: Mike Percy [mailto:mpe...@apache.org] Sent: Wednesday, February 06, 2013 2:14 PM To: user

Re: Flume and JMX

2013-02-06 Thread Mike Percy
, Mike On Wed, Feb 6, 2013 at 11:43 AM, matt.elli...@gdc4s.com wrote: Deploy, delete, start, stop, update configuration, restart, etc From: Mike Percy [mailto:mpe...@apache.org] Sent: Wednesday, February 06, 2013 2:38 PM To: user@flume.apache.org Subject: Re: Flume and JMX

Re: Flume in Windows?

2013-02-06 Thread Mike Percy
I know of someone who does this. They wrote their own startup scripts and stuff. Regards Mike On Wed, Feb 6, 2013 at 5:34 AM, venkatramanan venkatraman...@smartek21.com wrote: Hi, Am new in apache flume. Is there any possible to run the flume agent in windows 7. please advise thanks,

Re: SpoolDir marks item as completed, when sink fails

2013-02-05 Thread Mike Percy
will be done to resend the whole data? Just trying to grasp the basics On Fri, Feb 1, 2013 at 4:56 AM, Mike Percy mpe...@apache.org wrote: Tzur, that is expected, because the data is committed by the source onto the channel. Sources

Re: Authentication - Avro Source, Sink, RpcClient

2013-02-05 Thread Mike Percy
From: Mike Percy [mailto:mpe...@apache.org] Sent: Wednesday, January 23, 2013 9:16 PM To: user@flume.apache.org Subject: Re: Authentication - Avro Source, Sink, RpcClient

Re: Authentication - Avro Source, Sink, RpcClient

2013-02-05 Thread Mike Percy
From: Mike Percy [mailto:mpe...@apache.org] Sent: Tuesday, February 05, 2013 9:33 AM To: user@flume.apache.org Subject: Re: Authentication - Avro Source, Sink, RpcClient

Re: SpoolDir marks item as completed, when sink fails

2013-02-01 Thread Mike Percy
Tzur, that is expected, because the data is committed by the source onto the channel. Sources and sinks are decoupled, they only interact via the channel, which buffers the data and serves to mitigate impedance mismatches. On Thu, Jan 31, 2013 at 2:35 PM, Tzur Turkenitz tz...@vision.bi wrote:

Re: Flume-NG : Spooling dir source : java.io.IOException: Stream closed

2013-01-27 Thread Mike Percy
bcc: cdh-u...@cloudera.org No version of CDH currently ships with Flume 1.3.1, so redirecting this question to the user@flume.apache.org user list. Regards, Mike On Sun, Jan 27, 2013 at 8:56 PM, NGuyen thi Kim Tuyen tuyen03a...@gmail.com wrote: I'm using Flume-Ng 1.3.1 . At 11:33:49 UTC+7

Re: Flume-NG 1.3.1 : Spooling dir source : java.io.IOException: Stream closed

2013-01-27 Thread Mike Percy
Hi Nguyễn, The spooling source only works on done, immutable files. So they have to be atomically moved and they cannot be modified after being placed into the spooling directory. Regards, Mike On Sun, Jan 27, 2013 at 11:14 PM, NGuyen thi Kim Tuyen tuyen03a...@gmail.com wrote: Hi , Please

Re: HDFS Test Failure

2013-01-25 Thread Mike Percy
Seems strange. Connor have you tried running mvn clean install and do you get the same results? Flume is weird because we push SNAPSHOT builds per commit so you have to install to avoid strange dependency issues sometimes. It's especially insidious to do mvn clean package. I don't know if it's

Re: flume-cassandra

2013-01-24 Thread Mike Percy
. On Thu, Jan 24, 2013 at 10:43 AM, Mike Percy mpe...@cloudera.com wrote: What do you mean by collector? On Wed, Jan 23, 2013 at 9:05 PM, Sri Ramya ramya.1...@gmail.com wrote: Thank you very much. But I need a collector in my application, flume-ng does not have any collector. Thats why i

Re: what are the libraries needed for flume log4jappender

2013-01-23 Thread Mike Percy
What version of Flume are you using? Are you using Maven for your build? You should be able to get away with just flume-ng-core. On Wed, Jan 23, 2013 at 10:02 AM, yogender nerella ynere...@gmail.com wrote: Hi, I would like to make my app directly write events to an flume agent. What are

Re: what are the libraries needed for flume log4jappender

2013-01-23 Thread Mike Percy
the same issue. Yogi On Wed, Jan 23, 2013 at 11:36 AM, Mike Percy mpe...@apache.org wrote: What version of Flume are you using? Are you using Maven for your build? You should be able to get away with just flume-ng-core. On Wed, Jan 23, 2013 at 10:02 AM, yogender nerella ynere

Re: what are the libraries needed for flume log4jappender

2013-01-23 Thread Mike Percy
) at org.apache.flume.api.RpcClientFactory.getDefaultInstance(RpcClientFactory.java:128) at org.apache.flume.clients.log4jappender.Log4jAppender.activateOptions(Log4jAppender.java:184) Appreciate your help, Yogi On Wed, Jan 23, 2013 at 11:54 AM, Mike Percy mpe...@apache.org wrote: Yogi, Flume has lots of dependencies. You

Re: what are the libraries needed for flume log4jappender

2013-01-23 Thread Mike Percy
it only needs flume-ng-sdk.jar file. In that case, if I want to ship flume log4jappender, should I have to ship all these jar files in flume/lib directory? Yogi On Wed, Jan 23, 2013 at 12:08 PM, Mike Percy mpe...@apache.org wrote: I don't use Eclipse but my understanding is that mvn

Re: Can we treat a whole file as a Flume event?

2013-01-23 Thread Mike Percy
Yep my bad, typo :) On Wed, Jan 23, 2013 at 1:04 PM, Roshan Naik ros...@hortonworks.com wrote: Thats SpoolDirectorySource.java .. i thought you referred to SpoolingFileSource earlier. i assume that was a typo ? On Wed, Jan 23, 2013 at 11:53 AM, Mike Percy mpe...@apache.org wrote

Re: flume-cassandra

2013-01-23 Thread Mike Percy
collector with cassandra. If any body tried it before please help me. thank in advance. On Thu, Jan 24, 2013 at 10:26 AM, Mike Percy mpe...@cloudera.com wrote: Hi Sri, Cloudera originally created Flume, then contributed it to the Apache Software Foundation (ASF), and continues to invest

Re: Reliability in Flume

2013-01-23 Thread Mike Percy
Henry, Please see inline... On Wed, Jan 23, 2013 at 7:26 PM, Henry Ma henry.ma.1...@gmail.com wrote: Dear Flume developers and users, I understand that Flume NG uses channel-based transactions to guarantee reliable message delivery between agents. But in some extreme failure scenes, will

Re: Uncaught Exception When Using Spooling Directory Source

2013-01-18 Thread Mike Percy
, and then archive to HDFS each hour. By now the log files cannot be pushed to any collecting system. We want to the collecting system can PULL all of them remotely. Can you give me some guide? Thanks! On Fri, Jan 18, 2013 at 3:45 PM, Mike Percy mpe...@apache.org wrote: Can you provide more detail

Re: Uncaught Exception When Using Spooling Directory Source

2013-01-17 Thread Mike Percy
Hi Henry, The files must be immutable before putting them into the spooling directory. So if you copy them from a different file system then you can run into this issue. The right way to do it is to copy them to the same file system and then atomically move them into the spooling directory.
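The copy-then-move advice in this reply can be sketched in a few lines of Python. This is only an illustration, not anything shipped with Flume: the function name `deliver_to_spool` and the idea of a separate staging directory are assumptions, and the staging directory is assumed to live on the same file system as the spooling directory so that the rename is atomic.

```python
import os
import shutil


def deliver_to_spool(src_path, staging_dir, spool_dir):
    """Copy a finished file next to the spooling directory, then move it in.

    staging_dir is a hypothetical directory that must be on the same file
    system as spool_dir and must not be watched by the spooling source.
    """
    name = os.path.basename(src_path)
    staged = os.path.join(staging_dir, name)
    final = os.path.join(spool_dir, name)

    # The slow, non-atomic copy happens outside the spooling directory,
    # so the source never sees a partially written file.
    shutil.copy2(src_path, staged)

    # os.rename is atomic on POSIX when both paths are on the same file
    # system; across file systems it raises OSError instead of copying.
    os.rename(staged, final)
    return final
```

After the rename the file should not be modified again, since the reply above states that files in the spooling directory must be immutable.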

Re: Uncaught Exception When Using Spooling Directory Source

2013-01-17 Thread Mike Percy
to design the architecture? Which type of source and sink can fit? Thanks! On Fri, Jan 18, 2013 at 2:05 PM, Mike Percy mpe...@apache.org wrote: Hi Henry, The files must be immutable before putting them into the spooling directory. So if you copy them from a different file system then you can

New blog post on Flume performance tuning

2013-01-11 Thread Mike Percy
Hi folks, I just posted to the Apache blog on how to do performance tuning with Flume. I plan on following it up with a post about using the Flume monitoring capabilities while tuning. Feedback is welcome. https://blogs.apache.org/flume/entry/flume_performance_tuning_part_1 Regards, Mike

Re: New blog post on Flume performance tuning

2013-01-11 Thread Mike Percy
Thanks Brock! I've been working on this, off and on, for a while. :) On Fri, Jan 11, 2013 at 12:18 PM, Brock Noland br...@cloudera.com wrote: Nice post! On Fri, Jan 11, 2013 at 12:13 PM, Mike Percy mpe...@apache.org wrote: Hi folks, I just posted to the Apache blog on how to do

Re: New blog post on Flume performance tuning

2013-01-11 Thread Mike Percy
, Tariq https://mtariq.jux.com/ On Sat, Jan 12, 2013 at 2:15 AM, Mike Percy mpe...@apache.org wrote: Thanks Brock! I've been working on this, off and on, for a while. :) On Fri, Jan 11, 2013 at 12:18 PM, Brock Noland br...@cloudera.com wrote: Nice post! On Fri, Jan 11, 2013

Re: [ANNOUNCE] Apache Flume 1.3.1 released

2013-01-04 Thread Mike Percy
Hari, Thanks for taking care of this release! Well done! Regards, Mike On Wed, Jan 2, 2013 at 3:53 PM, Hari Shreedharan hshreedha...@apache.org wrote: The Apache Flume team is pleased to announce the release of Flume version 1.3.1. Flume is a distributed, reliable, and available service for

Re: A customer use case

2012-12-04 Thread Mike Percy
Hi Emile, On Tue, Dec 4, 2012 at 2:04 AM, Emile Kao emile...@gmx.net wrote: 1. Which is the best way to implement such a scenario using Flume/ Hadoop? You could use the file spooling client / source to stream these files back in the latest trunk and upcoming Flume 1.3.0 builds, along with

Re: .tmp in hdfs sink

2012-11-20 Thread Mike Percy
1.3.0 is out? I am currently using the snapshot version of 1.3.0 On Tue, Nov 20, 2012 at 11:16 AM, Mike Percy mpe...@apache.org wrote: Mohit, FLUME-1660 is now committed and it will be in 1.3.0. In the case where you are using 1.2.0, I suggest running with hdfs.rollInterval set so

Re: Netcat source stops processing data

2012-11-20 Thread Mike Percy
Rahul, A patch and a unit test to add this as an option would be greatly appreciated! There is already a JIRA open for this: https://issues.apache.org/jira/browse/FLUME-1713 Regards, Mike On Tue, Nov 20, 2012 at 3:20 PM, Rahul Ravindran rahu...@yahoo.com wrote: Pinging on this slightly old

Re: .tmp in hdfs sink

2012-11-15 Thread Mike Percy
Hi Mohit, this is a complicated issue. I've filed https://issues.apache.org/jira/browse/FLUME-1714 to track it. In short, it would require a non-trivial amount of work to implement this, and it would need to be done carefully. I agree that it would be better if Flume handled this case more

Re: [ANNOUNCE] New Apache Flume committer - Patrick Wendell

2012-11-13 Thread Mike Percy
Patrick, welcome! Great to have you on board. Regards, Mike On Mon, Nov 12, 2012 at 1:04 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: On behalf of the Apache Flume PMC, I am excited to welcome Patrick Wendell as a committer on Flume! Patrick has contributed significantly to the

Re: SNMP Source

2012-11-10 Thread Mike Percy
Hi Simon, Nothing that I know of. Of course, contributions are welcome! :) Regards, Mike On Fri, Nov 9, 2012 at 3:04 AM, Simon Monecke simonmone...@gmail.com wrote: Hi, is there any solutions to receive SNMP-Logs with flume? Regards, Simon

Re: Flume bz2 issue while processing by a map reduce job

2012-11-02 Thread Mike Percy
Hi Jagadish, My understanding based on investigating this issue over the last couple of days is that MapReduce jobs will only read the first section of a concatenated bzip2 file. I believe you are correct that https://issues.apache.org/jira/browse/HADOOP-6852 is the only way to solve this issue,

Re: Thread safety in RpcClientFactory

2012-10-29 Thread Mike Percy
Yes, it's thread safe. Regards Mike On Wed, Oct 24, 2012 at 9:02 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Is object returned by rpcClient = RpcClientFactory.getDefaultInstance(hostName, port); thread safe? I am currently sharing rpcClient object with several threads.

Re: Flume NG Source for Avro data file

2012-10-17 Thread Mike Percy
If this is for streaming Avro files off the local filesystem, then it's in scope for https://issues.apache.org/jira/browse/FLUME-1633 Regards Mike On Wed, Oct 17, 2012 at 7:46 AM, Brock Noland br...@cloudera.com wrote: Hi, Just noticed this while cleaning my inbox. Sorry for the late reply.

Re: Errors

2012-10-11 Thread Mike Percy
You should consider how your system will act if there is a downstream failure. Even a capacity of 500 is extremely (orders of magnitude) too small in my opinion. Consider setting a channel capacity equal to (average events per second ingested * # of seconds downtime you want to tolerate). So if
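The sizing rule quoted in this reply (average events per second ingested times the seconds of downtime to tolerate) can be worked through as a short calculation. The sketch below is illustrative only: the function name and the workload numbers are assumptions, not values from the thread.

```python
import math


def channel_capacity(avg_events_per_second, tolerated_downtime_seconds):
    """Events a channel must buffer to ride out a downstream outage.

    Implements the rule of thumb from the reply above:
    capacity = average ingest rate * seconds of downtime to tolerate.
    """
    return math.ceil(avg_events_per_second * tolerated_downtime_seconds)


# Hypothetical workload: 1,000 events/s and one hour of tolerated downtime.
print(channel_capacity(1000, 3600))  # 3600000
```

With those assumed numbers the result is 3,600,000 events, which shows why a capacity of 500 is described as orders of magnitude too small.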
