[ANNOUNCE] Change of Apache Flume PMC Chair
Dear Flume Users and Developers, I have had the pleasure of serving as the PMC Chair of Apache Flume since its graduation three years ago. I sincerely thank you and the Flume PMC for this opportunity. However, I have decided to step down from this responsibility due to personal reasons. I am very happy to announce that on the request of Flume PMC and with the approval from the board of directors at The Apache Software Foundation, Hari Shreedharan is hereby appointed as the new PMC Chair. I am confident that Hari will do everything possible to help further grow the community and adoption of Apache Flume. Please join me in congratulating Hari on his appointment and welcoming him to this role. Regards, Arvind Prabhakar
Re: contributing to flume
Hi Eran,

I added you to the contributors list and assigned the Jira to you. You should be able to make the status changes yourself.

Regards,
Arvind Prabhakar

On Sat, Sep 26, 2015 at 4:20 AM, IT CTO <goi@gmail.com> wrote:
> Hi,
> I am new to flume and want to be able to contribute to flume.
> I opened a jira issue (flume-2802) but I can't assign it to me so I can't
> change the status to get the patch reviewed.
> Can someone help here?
>
> Eran
> --
> Eran | "You don't need eyes to see, you need vision" (Faithless)
[jira] [Assigned] (FLUME-2802) Folder name interceptor
[ https://issues.apache.org/jira/browse/FLUME-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Prabhakar reassigned FLUME-2802:
---
    Assignee: Eran W

> Folder name interceptor
> ---
>
>                 Key: FLUME-2802
>                 URL: https://issues.apache.org/jira/browse/FLUME-2802
>             Project: Flume
>          Issue Type: New Feature
>            Reporter: Eran W
>            Assignee: Eran W
>         Attachments: FLUME-2802.patch
>
>
> This interceptor retrieves the last folder name from the
> SpoolDir.fileHeaderKey and sets it to the given folderKey.
> This allows users to set the target HDFS directory based on the source
> directory and not the whole path or file name.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
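The behaviour described in the FLUME-2802 summary can be sketched roughly as follows. This is not the attached patch: it avoids the Flume API entirely and works on a plain header map, and the class name `FolderNameSketch`, both method names, and the header key values used in the usage note are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, Flume-free sketch of the "folder name interceptor" idea.
final class FolderNameSketch {

    // Returns the name of the directory that directly contains the file,
    // or an empty string when the path has no parent folder.
    static String lastFolder(String filePath) {
        String p = filePath.replace('\\', '/');
        int end = p.lastIndexOf('/');
        if (end <= 0) {
            return "";
        }
        int start = p.lastIndexOf('/', end - 1);
        return p.substring(start + 1, end);
    }

    // Reads the file path stored under fileHeaderKey and, if present,
    // stores its last folder name under folderKey.
    static void intercept(Map<String, String> headers,
                          String fileHeaderKey, String folderKey) {
        String path = headers.get(fileHeaderKey);
        if (path != null) {
            headers.put(folderKey, lastFolder(path));
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("file", "/data/in/app1/file.log");
        intercept(headers, "file", "folder");
        System.out.println(headers.get("folder"));
    }
}
```

With the assumed keys `file` and `folder`, a downstream sink path could then reference the `folder` header instead of the full file path.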
Re: [VOTE] Release Apache Flume version 1.6.0 RC1
Thanks Rufus for helping with this release. My vote: +1 * Verified signatures and checksums * Top level files look good (Nit: The NOTICE file copyright statement says 2012 instead of 2015) * The tag looks good (Nit: the DOAP file is not included in the source tar ball) * RAT test passes with configured exclusions (checked via mvn verify) Regards, Arvind Prabhakar On Sun, May 17, 2015 at 5:50 AM, 李响 wate...@gmail.com wrote: +1 All test cases pass using the latest OpenJDK 1.7.0_79. Thanks Johny!! On Thu, May 14, 2015 at 9:44 PM, Ashish paliwalash...@gmail.com wrote: +1 Build works good, all test cases pass Randomly picked few JIRA's and validate the commits looks good Thank You Johny for all the hard work. On Tue, May 12, 2015 at 11:13 PM, Johny Rufus jru...@cloudera.com wrote: Hi All, This is the ninth release for Apache Flume as a top-level project, version 1.6.0. We are voting on release candidate RC1. It fixes the following issues: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=blob;f=CHANGELOG;h=53ea45cbd496b89fcd84c89f2ebd8d51e5bb8016;hb=f7560038a25430378f09ea631b6e472979d7988c *** Please cast your vote within the next 72 hours *** The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1) for the source and binary artifacts can be found here: http://people.apache.org/~hshreedharan/apache-flume-1.6.0-rc1/ Maven staging repo: https://repository.apache.org/content/repositories/orgapacheflume-1016/ The tag to be voted on: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=f7560038a25430378f09ea631b6e472979d7988c Flume's KEYS file containing PGP keys we use to sign the release: http://www.apache.org/dist/flume/KEYS Thanks, Rufus -- thanks ashish Blog: http://www.ashishpaliwal.com/blog My Photo Galleries: http://www.pbase.com/ashishpaliwal -- 李响 手机 cellphone :+86-1368-113-8972 E-mail :wate...@gmail.com MSN :wate...@hotmail.com
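The checksum verification that voters report above can be illustrated with a small sketch. It only covers the `*.md5` / `*.sha1` comparison, not the PGP signature check against the KEYS file; the class name `ChecksumSketch` and its methods are hypothetical, and the assumption that a checksum file may hold the hex digest followed by a file name is mine, not something the thread states.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing an artifact against a published checksum.
final class ChecksumSketch {

    // Computes the lowercase hex digest of data with the given JCA algorithm
    // name, for example "MD5" or "SHA-1".
    static String hexDigest(String algorithm, byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Unknown digest algorithm: " + algorithm, e);
        }
    }

    // Compares against the first whitespace-separated token of the published
    // checksum text, ignoring case.
    static boolean matches(String algorithm, byte[] data, String published) {
        String expected = published.trim().split("\\s+")[0];
        return hexDigest(algorithm, data).equalsIgnoreCase(expected);
    }

    public static void main(String[] args) {
        byte[] sample = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(hexDigest("SHA-1", sample));
    }
}
```

For a real release candidate the byte array would come from the downloaded tarball (for instance via `java.nio.file.Files.readAllBytes`) and `published` from the matching `.md5` or `.sha1` file.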
Re: Flume performance measurements
Done. Please let me know if you run into any issues.

Regards,
Arvind

On Wed, Apr 8, 2015 at 3:58 PM, Roshan Naik ros...@hortonworks.com wrote:

roshan_naik is my login to cwiki.apache.org

On 4/8/15 3:55 PM, Arvind Prabhakar arv...@apache.org wrote:

Added Hari to the wiki. Roshan, I could not look you up on the wiki users, can you please tell me your username? If you don't have one yet, please register and let me know.

Regards,
Arvind Prabhakar

On Wed, Apr 8, 2015 at 3:26 PM, Roshan Naik ros...@hortonworks.com wrote:

Arvind, Please do let me know once you have granted me permission to the wiki.
-roshan

From: Hari Shreedharan hshreedha...@cloudera.com
Date: Thursday, April 2, 2015 3:06 PM
To: Roshan Naik ros...@hortonworks.com
Cc: dev@flume.apache.org
Subject: Re: Flume performance measurements

Arvind - please could you grant Roshan access to the wiki.

Thanks,
Hari

On Thu, Apr 2, 2015 at 3:04 PM, Roshan Naik ros...@hortonworks.com wrote:

Could you grant me write access to wiki? username: roshannaik

On 4/2/15 2:53 PM, Hari Shreedharan hshreedha...@cloudera.com wrote:

Roshan, Could you update the performance measurements page on our wiki with this info? That would be more useful to reference.

Thanks,
Hari

On Thu, Apr 2, 2015 at 2:34 PM, Roshan Naik ros...@hortonworks.com wrote:

Sample Flume v1.4 Measurements for reference:

Here are some sample measurements taken with a single agent and 500 byte events.

Cluster Config: 20-node Hadoop cluster (1 name node and 19 data nodes).
Machine Config: 24 cores - Xeon E5-2640 v2 @ 2.00GHz, 164 GB RAM.

1. File channel with HDFS Sink (Sequence File):

Source: 4 x Exec Source, 100k batchSize
HDFS Sink Batch size: 500,000
Channel: File
Number of data dirs: 8

Events/Sec

Sink Count   1 data dirs   2 data dirs   4 data dirs   6 data dirs   8 data dirs   10 data dirs
8            24.8 k        43.8 k        72.5 k        77 k          78.6 k        76.6 k

Partially measured rows (the data-dir column for each value is not recoverable from this copy):

Sink Count   Events/Sec
1            14.3 k
2            21.9 k
4            35.8 k
10           58 k
12           49.3 k, 49 k

Was looking for sweet spot in perf. So did not take measurements for all data points on grid. Only took for the ones that made sense. For example: when perf dropped by adding more sinks, did not take more measurements for those rows.

2. HDFS Sink:

Channel: Memory

# of HDFS Sinks   Snappy            Snappy            Sequence File
                  BatchSz:1.2mill   BatchSz:1.4mill   BatchSz:1.2mill
1                 34.3 k            33 k              33 k
2                 71 k              75 k              69 k
4                 141 k             145 k             141 k
8                 271 k             273 k             251 k
12                382 k             380 k             370 k
16                478 k             538 k             486 k

Some simple observations:
* Increasing number of dataDirs helps FC perf even on single disk systems
* Increasing number of sinks helps
* Max throughput observed was about 538k events/sec for HDFS sink which is approx 240MB/s
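As a worked check on the last observation above, the conversion from events per second to bytes per second can be sketched as below. The thread does not say how the "approx 240MB/s" figure was derived; the calculation here simply assumes every event carries exactly 500 bytes and nothing else, and the class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope conversion from event rate to data rate.
final class ThroughputSketch {

    // Decimal megabytes per second (1 MB = 1,000,000 bytes).
    static double megabytesPerSecond(long eventsPerSecond, int bytesPerEvent) {
        return eventsPerSecond * (double) bytesPerEvent / 1_000_000.0;
    }

    // Binary mebibytes per second (1 MiB = 1,048,576 bytes).
    static double mebibytesPerSecond(long eventsPerSecond, int bytesPerEvent) {
        return eventsPerSecond * (double) bytesPerEvent / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // 538k events/sec at 500 bytes per event, as reported in the thread.
        System.out.println(megabytesPerSecond(538_000L, 500));
        System.out.println(mebibytesPerSecond(538_000L, 500));
    }
}
```

Under that assumption the raw product is 269 MB/s (about 256.5 MiB/s), somewhat above the quoted figure, so the quoted number presumably rests on a different measurement basis.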
[jira] [Created] (FLUME-2564) Failover processor does not kick-in for HDFS sink on IOException
Arvind Prabhakar created FLUME-2564:
---
             Summary: Failover processor does not kick-in for HDFS sink on IOException
                 Key: FLUME-2564
                 URL: https://issues.apache.org/jira/browse/FLUME-2564
             Project: Flume
          Issue Type: Bug
            Reporter: Arvind Prabhakar
            Assignee: Arvind Prabhakar

From a recent thread on the user mailing list:

{quote}
I have investigated the HDFSEventSink source code, found if the exception was IOException, the exception would not throw to the upper layer, So FailOverSinkProcessor would not mark this sink as dead.
{quote}

{code}
} catch (IOException eIO) {
  transaction.rollback();
  LOG.warn("HDFS IO error", eIO);
  return Status.BACKOFF;
} catch (Throwable th) {
  transaction.rollback();
  LOG.error("process failed", th);
  if (th instanceof Error) {
    throw (Error) th;
  } else {
    throw new EventDeliveryException(th);
  }
}
{code}

The failover processor should be able to use the backoff signal as an indication of failure and switch over to the next sink.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
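The behaviour requested in FLUME-2564 can be sketched in isolation as follows. This is not Flume's failover processor: `SketchSink`, the local `Status` enum, and `processWithFailover` are hypothetical stand-ins that only mirror the idea of treating a backoff result the same way as a thrown exception.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained model of "fail over on BACKOFF".
final class FailoverSketch {

    // Local stand-in for the two outcomes a sink reports.
    enum Status { READY, BACKOFF }

    // Local stand-in for a sink; process() may throw on delivery failure.
    interface SketchSink {
        Status process() throws Exception;
    }

    // Tries sinks in priority order and returns the index of the first one
    // that reports READY, or -1 if every sink backed off or threw.
    static int processWithFailover(List<SketchSink> sinksByPriority) {
        for (int i = 0; i < sinksByPriority.size(); i++) {
            try {
                if (sinksByPriority.get(i).process() == Status.READY) {
                    return i;
                }
                // BACKOFF falls through: treated as a failure of this sink.
            } catch (Exception e) {
                // A thrown exception also moves on to the next sink.
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<SketchSink> sinks = new ArrayList<>();
        sinks.add(() -> Status.BACKOFF);
        sinks.add(() -> Status.READY);
        System.out.println(processWithFailover(sinks));
    }
}
```

One design caveat: a sink may also report backoff when it simply has nothing to deliver, so a real change would likely need a way to tell an idle sink apart from one that hit an I/O error.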
Re: [VOTE] Apache Flume 1.5.2 RC1
+1 * Verified checksums and signatures * Verified build Regards, Arvind Prabhakar On Fri, Nov 14, 2014 at 2:58 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: +1. - Verified signatures and checksums - Built and ran tests - Verified top-level files. Thanks, Hari On Thu, Nov 13, 2014 at 1:17 PM, Roshan Naik ros...@hortonworks.com wrote: +1 verified the code change -roshan On Wed, Nov 12, 2014 at 8:03 PM, Jarek Jarcec Cecho jar...@apache.org wrote: +1 * Verified checksums and signature files * Verified that each jar in binary tarball is in the license * Checked top level files (NOTICE, ...) * Run tests (pretty much the same email I’ve sent for 1.5.1 :)) Jarcec On Nov 12, 2014, at 1:15 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: This is the eighth release for Apache Flume as a top-level project, version 1.5.2. We are voting on release candidate RC1. This release fixes an incompatibility with Java 6 based clients found in Apache Flume 1.5.1 Release. It fixes the following issues: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=blob;f=CHANGELOG;h=cc7321361d0b702ba870de20d6a3d2106987186a;hb=229442aa6835ee0faa17e3034bcab42754c460f5 *** Please cast your vote within the next 72 hours *** The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1) for the source and binary artifacts can be found here: https://people.apache.org/~hshreedharan/apache-flume-1.5.2-rc1/ Maven staging repo: https://repository.apache.org/content/repositories/orgapacheflume-1008/ The tag to be voted on: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=229442aa6835ee0faa17e3034bcab42754c460f5 Flume's KEYS file containing PGP keys we use to sign the release: http://www.apache.org/dist/flume/KEYS Thanks, Hari
Re: [VOTE] Release Apache Flume 1.5.1 RC1
That sounds good to me. Thanks for working on this release Hari. Regards, Arvind Prabhakar On Mon, Nov 10, 2014 at 11:10 AM, Hari Shreedharan hshreedha...@cloudera.com wrote: It does not look like we ever actually included the dev-support directory in the source tarball (I checked 1.3.1, 1.4.0 and 1.5.0.1). If we need a re-spin for another reason, I will try to fix the release process to pull this in and remove the iml files. Arvind - does that sound good to you? Otherwise I will spin another RC. Thanks, Hari On Sun, Nov 9, 2014 at 8:47 PM, Arvind Prabhakar arv...@apache.org wrote: +1 * Verified signatures * Verified checksums * Verified the tag (minor issues noted below - would be good to address if there is RC2) * Builds correctly * All tests run with default profile and avro version set to 1.7.5 (to avoid an issue with snappy on Mac OS) Nits: * The tag and sources match except that the src tarball contains the iml files and does not contain the dev-support directory. Since both the iml files and dev-support files are not related to product functionality, it is OK for the tarball to not include them. However, if there is a respin it would be good to address that. * It is time we updated the avro version in the system to a newer release, which among other things will allow people to build on Mac OS without running into the JDK7+Snappy 1.0.4 problem where tests fail because the native library does not load. Regards, Arvind On Thu, Nov 6, 2014 at 3:17 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: This is the seventh release for Apache Flume as a top-level project, version 1.5.1. We are voting on release candidate RC1. 
It fixes the following issues: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=blob_plain;f=CHANGELOG;hb=c74804226bcee59823c0cbc09cdf803a3d9e6920 *** Please cast your vote within the next 72 hours *** The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1) for the source and binary artifacts can be found here: https://people.apache.org/~hshreedharan/apache-flume-1.5.1-rc1/ Maven staging repo: https://repository.apache.org/content/repositories/orgapacheflume-1006/ The tag to be voted on: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=c74804226bcee59823c0cbc09cdf803a3d9e6920 Flume's KEYS file containing PGP keys we use to sign the release: http://www.apache.org/dist/flume/KEYS Thanks, Hari
Re: What Source/Sink would you want next?
(cross-posting this to dev@) While I do not speak for the availability of other committers of the project, I would like to spend some time with the contributors to help identify the most important needs of the project, and see how best we can get those committed into the codebase. Santiago (and others who would like to contribute) - please go ahead and create the necessary Jiras if they do not exist already, and invite the community to vote on those. That way we can prioritize the review and commit for functionality that is aligned with community requirements. Regards, Arvind Prabhakar On Fri, Sep 26, 2014 at 5:13 AM, jean garutti lagaru...@yahoo.fr wrote: hi This seems to be great. I'll wait to have the 'production ready' flag for ELS mapping patch. I think more effort should be done to have this sink more configurable like what we can do with logstash. anyway it's nice to share your development to the community i'd love to have the mongodb sink packaged in the official flume release. jean On Thursday, September 25, 2014 at 9:48, Santiago Mola sm...@stratio.com wrote: Hi Jean, 2014-09-24 22:44 GMT+02:00 Jean lagaru...@yahoo.fr: A solid mongodb source would be Nice. Definitely! I wish the same for elasticsearch sink where we could specify the mapping for the headers instead of sending everything as a string We have a serializer that creates mappings for ElasticSearch [1]. It is not ready for production [2] but it is one of our priorities. [1] https://github.com/Stratio/stratio-ingestion/tree/develop/stratio-serializers/stratio-elasticsearch-serializer [2] https://github.com/Stratio/stratio-ingestion/issues/21 Thanks for your feedback, -- Santiago M. Mola http://www.stratio.com/ Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón, Madrid Tel: +34 91 352 59 42 // *@stratiobd https://twitter.com/StratioBD*
[jira] [Commented] (FLUME-2365) Please create a DOAP file for your TLP
[ https://issues.apache.org/jira/browse/FLUME-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032013#comment-14032013 ] Arvind Prabhakar commented on FLUME-2365: - [~hshreedharan] - I updated the files.xml in the site repository. Once the project shows up correctly on http://projects.apache.org/indexes/alpha.html#F we can go ahead and close this Jira out. Please create a DOAP file for your TLP -- Key: FLUME-2365 URL: https://issues.apache.org/jira/browse/FLUME-2365 Project: Flume Issue Type: Task Reporter: Sebb Assignee: Ashish Paliwal Attachments: flume.rdf As per my recent e-mail to your dev list, please can you set up a DOAP for your project and get it added to files.xml? Please see http://projects.apache.org/create.html Once you have created the DOAP and committed it to your source code repository, please submit it for inclusion in the Apache projects listing as per: http://projects.apache.org/create.html#submit Remember, if you ever move or rename the doap file in future, please ensure that files.xml is updated to point to the new location. Thanks! -- This message was sent by Atlassian JIRA (v6.2#6252)
ApacheCon CFP closes June 25
Dear Flume enthusiast, As you may be aware, ApacheCon will be held this year in Budapest, on November 17-23. (See http://apachecon.eu for more info.) The Call For Papers for that conference is still open, but will be closing soon. We need your talk proposals to represent Flume at ApacheCon. We need all kinds of talks - deep technical talks, hands-on tutorials, introductions for beginners, or case studies about the awesome stuff you're doing with Flume. Please consider submitting a proposal at http://events.linuxfoundation.org//events/apachecon-europe/program/cfp Thanks, Arvind Prabhakar
Re: [VOTE] Apache Flume 1.5.0.1 RC1
+1 Thanks for shepherding this Hari. Regards, Arvind Prabhakar On Tue, Jun 10, 2014 at 3:40 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: This is a vote for the next release of Apache Flume, version 1.5.0.1. We are voting on release candidate RC1. It fixes the following issues: http://s.apache.org/v7X *** Please cast your vote within the next 72 hours *** The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1) for the source and binary artifacts can be found here: https://people.apache.org/~hshreedharan/apache-flume-1.5.0.1-rc1/ Maven staging repo: https://repository.apache.org/content/repositories/orgapacheflume-1004/ The tag to be voted on: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=ceda6aa1126a01370641caf729d8b1dd6d80aa61 Flume's KEYS file containing PGP keys we use to sign the release: http://www.apache.org/dist/flume/KEYS Thanks, Hari
Re: [VOTE] Apache Flume 1.5.0 RC1
+1 * Verified signatures and checksums for both binary and source tarballs * Rat check looks good on source tarball * Nit: Notice file has dated header, needs to be updated but not a blocker Regards, Arvind Prabhakar On Wed, May 7, 2014 at 3:28 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: This is a vote for the next release of Apache Flume, version 1.5.0. We are voting on release candidate RC1. It fixes the following issues: http://s.apache.org/4eQ *** Please cast your vote within the next 72 hours *** The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1) for the source and binary artifacts can be found here: https://people.apache.org/~hshreedharan/apache-flume-1.5.0-rc1/ Maven staging repo: https://repository.apache.org/content/repositories/orgapacheflume-1001/ The tag to be voted on: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=8633220df808c4cd0c13d1cf0320454a94f1ea97 Flume's KEYS file containing PGP keys we use to sign the release: http://www.apache.org/dist/flume/KEYS Thanks, Hari
Re: [DISCUSS] Release Flume 1.5.0
Thanks for bringing this up Hari. A new release for Flume is overdue in my opinion. Regards, Arvind Prabhakar On Thu, Jan 30, 2014 at 9:43 AM, Chiwan Park chiwanp...@icloud.com wrote: +1 on new release! -- Regards, Chiwan Park On Jan 31, 2014, at 2:17 AM, Hari Shreedharan hshreedha...@cloudera.com wrote: Hi folks, It has been about 6 months since we did a release. We have added several new features and fixed a lot of bugs. What do you guys think about releasing Flume 1.5.0? Thanks Hari
Re: Phoenix- Hbase Sink
Apologies for the delay Ravi and Hari - last time I tried to add you the Wiki was being upgraded and was not ready. I have now added you in and you should be able to see the edit button. Regards, Arvind Prabhakar On Thu, Dec 19, 2013 at 5:18 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: Arvind - Could you please give Ravi edit privileges. I don’t seem to have access. Thanks, Hari On Thursday, December 19, 2013 at 5:13 PM, Ravi Kiran wrote: Hi Hari, Can you please grant me permissions to update the WIKI to have pointers to Phoenix. Regards Ravi On Sun, Dec 15, 2013 at 7:25 AM, Ravi Kiran maghamraviki...@gmail.com wrote: Hi Hari, It's maghamravikiran Thanks Ravi. On Sat, Dec 14, 2013 at 7:03 AM, Hari Shreedharan hshreedha...@cloudera.com wrote: +dev@ Hi Ravi, Can you please send your confluence (wiki) login id? Thanks, Hari On Thursday, December 12, 2013 at 4:24 AM, Ravi Kiran wrote: Hi Hari, I don't seem to have permissions to edit the page. Can you please grant me permissions. Regards Ravi On Thu, Dec 12, 2013 at 10:40 AM, Hari Shreedharan hshreedha...@cloudera.com wrote: Hi Ravi, Thanks for the information. You could post a link to this on the wiki here: https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Plugins for users to be able to find it. Thanks, Hari On Wednesday, December 11, 2013 at 8:26 PM, Ravi Kiran wrote: Hi all, The Apache Phoenix project now provides a custom sink for streaming Flume events into HBase. These events may be queried through SQL using the Phoenix JDBC driver. The detailed instructions can be found here (still on github until we move to Apache): https://github.com/forcedotcom/phoenix/wiki/Apache-Flume-Plugin. Regards Ravi
[jira] [Commented] (FLUME-2191) HDFS Minicluster tests failing after protobuf upgrade.
[ https://issues.apache.org/jira/browse/FLUME-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785791#comment-13785791 ]

Arvind Prabhakar commented on FLUME-2191:
-
+1 changes look good to me. Will commit after a sanity run.

HDFS Minicluster tests failing after protobuf upgrade.
--
                Key: FLUME-2191
                URL: https://issues.apache.org/jira/browse/FLUME-2191
            Project: Flume
         Issue Type: Bug
           Reporter: Hari Shreedharan
           Assignee: Hari Shreedharan
           Priority: Blocker
        Attachments: FLUME-2191.patch

I ran the full build in hadoop-1 profile, but it looks like the protobuf upgrade broke the hadoop-2 profile. The HDFS Sink test on Minicluster fails with this:

{code}
Running org.apache.flume.sink.hdfs.TestHDFSEventSinkOnMiniCluster
2013-09-13 12:11:31.159 java[58566:1203] Unable to load realm info from SCDynamicStore
2013-09-13 12:11:31.208 java[58566:1203] Unable to load realm info from SCDynamicStore
Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 4.238 sec FAILURE!
simpleHDFSTest(org.apache.flume.sink.hdfs.TestHDFSEventSinkOnMiniCluster) Time elapsed: 1979 sec ERROR!
java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
	at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetDatanodeReportRequestProto.getSerializedSize(ClientNamenodeProtocolProtos.java:21638)
	at com.google.protobuf.AbstractMessageLite.toByteString(AbstractMessageLite.java:49)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.constructRpcRequest(ProtobufRpcEngine.java:137)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:181)
	at com.sun.proxy.$Proxy15.getDatanodeReport(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:84)
	at com.sun.proxy.$Proxy15.getDatanodeReport(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDatanodeReport(ClientNamenodeProtocolTranslatorPB.java:488)
	at org.apache.hadoop.hdfs.DFSClient.datanodeReport(DFSClient.java:1642)
	at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:1703)
	at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:1722)
	at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1066)
	at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:929)
	at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:588)
	at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:527)
	at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:398)
	at org.apache.flume.sink.hdfs.TestHDFSEventSinkOnMiniCluster.simpleHDFSTest(TestHDFSEventSinkOnMiniCluster.java:85)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222
[jira] [Assigned] (FLUME-2199) Flume builds with new version require mvn install before site can be generated
[ https://issues.apache.org/jira/browse/FLUME-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar reassigned FLUME-2199: --- Assignee: Andrew Bayer Flume builds with new version require mvn install before site can be generated -- Key: FLUME-2199 URL: https://issues.apache.org/jira/browse/FLUME-2199 Project: Flume Issue Type: Bug Components: Build Affects Versions: v1.4.0 Reporter: Andrew Bayer Assignee: Andrew Bayer Fix For: v1.5.0 Attachments: FLUME-2199.patch At this point, if you change the version for Flume, you need to run a mvn install before you can run with -Psite (or, for that matter, javadoc:javadoc) enabled. This is because the top-level POM in flume.git/pom.xml is both the parent POM and the root of the reactor - since it's the parent, it's got to run before any of the children that inherit from it, but site generation should be running *after* all the children, so that it probably pulls in the reactor's build of each child module, rather than having to pull in one already installed/deployed before the build starts. There are a bunch of other reasons to split parent POM and top-level POM, but that's the biggest one right there. Also, the javadoc jar generation is a bit messed up - every module's javadoc jar contains not only its own javadocs but the javadocs for every Flume module it depends on. That, again, may make sense in a site context for the top-level, but not for the individual modules. This results in unnecessary bloat in the javadoc jars, and unnecessary time spent downloading the *-javadoc-resources.jar for every dependency each module has, due to how the javadoc plugin works. Also the whole site generation per-module thing, which I am not a fan of in most cases. I don't think it's needed here. Tweaking the site plugin not to run anywhere but the top-level and the javadoc plugin to not do the dependency aggregation anywhere but the top-level should make a big difference on build speed. 
-- This message was sent by Atlassian JIRA (v6.1#6144)
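The ordering conflict described in FLUME-2199 can be illustrated with a toy dependency graph. This is only a sketch of the constraints, not Maven's reactor algorithm: the helper `build_order` and the names `pom`, `flume-parent`, and `flume-root` are hypothetical, and the child module names are used purely for illustration.

```python
from graphlib import TopologicalSorter, CycleError


def build_order(must_run_after):
    """Return one valid order for {node: set of nodes it must run after}.

    Raises CycleError when the constraints cannot all be satisfied.
    """
    return list(TopologicalSorter(must_run_after).static_order())


children = ["flume-ng-core", "flume-ng-sinks"]  # illustrative module names

# One POM playing both roles: children inherit from it (so they run after it),
# yet its aggregated site should run after every child.
combined = {child: {"pom"} for child in children}
combined["pom"] = set(children)

try:
    build_order(combined)
    combined_ok = True
except CycleError:
    combined_ok = False  # the two roles contradict each other

# Split roles: a parent POM that children inherit from, and a separate
# top-level aggregator whose site generation runs after the children.
split = {child: {"flume-parent"} for child in children}
split["flume-parent"] = set()
split["flume-root"] = set(children)

order = build_order(split)
print(combined_ok, order)
```

In this toy model the combined POM yields contradictory constraints, while the split layout orders cleanly with the parent first and the aggregator last. Maven itself does not report a cycle in the combined case; per the issue description it runs the parent first, which is why site generation falls back to already installed artifacts.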
Re: [ANNOUNCE] New Flume Committer - Roshan Naik
Congratulations Roshan! Regards, Arvind Prabhakar On Tue, Sep 24, 2013 at 4:05 PM, Mike Percy mpe...@apache.org wrote: Congrats Roshan, welcome! Mike On Tue, Sep 24, 2013 at 3:47 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Congratulations Roshan, well done! Jarcec On Tue, Sep 24, 2013 at 03:39:13PM -0700, Hari Shreedharan wrote: On behalf of the Apache Flume PMC, I am excited to welcome Roshan Naik as a committer on the Apache Flume project. Roshan has actively contributed several patches to the Flume project, including bug fixes, Windows support and new features. Congratulations and Welcome, Roshan! Cheers, Hari Shreedharan
Re: [ANNOUNCE] New Flume Committer - Wolfgang Hoschek
Congratulations Wolfgang! Regards, Arvind Prabhakar On Tue, Sep 24, 2013 at 4:05 PM, Mike Percy mpe...@apache.org wrote: Congrats Wolfgang, and welcome! Mike On Tue, Sep 24, 2013 at 3:46 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Congratulations Wolfgang, well done! Jarcec On Tue, Sep 24, 2013 at 03:39:12PM -0700, Hari Shreedharan wrote: On behalf of the Apache Flume PMC, I am excited to welcome Wolfgang Hoschek as a committer on the Apache Flume project. Wolfgang contributed a new sink with the ability to do heavyweight ETL-style processing and writing to Apache Solr indices. Congratulations and Welcome, Wolfgang! Cheers, Hari Shreedharan
[jira] [Commented] (FLUME-2140) Support diverting bad events from pipeline
[ https://issues.apache.org/jira/browse/FLUME-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732732#comment-13732732 ] Arvind Prabhakar commented on FLUME-2140: - [Discussion thread|http://flume.markmail.org/thread/y3cks6hdgof3kxu6#query:+page:1+mid:rx3zm53t4dhmqskk+state:results] on this subject in the user-list for reference. Support diverting bad events from pipeline -- Key: FLUME-2140 URL: https://issues.apache.org/jira/browse/FLUME-2140 Project: Flume Issue Type: New Feature Components: Node Reporter: Arvind Prabhakar A *bad event* can be any event that causes persistent sink side processing failure due to the inherent nature of the event itself. Note that failures that are not related to the inherent nature of the event such as network communication failure, downstream capacity failure etc., do not make the event a bad-event. The presence of a bad event in a channel can cause the entire pipeline to choke and become unusable. Flume should therefore be able to identify bad events and provide a facility to route them out of the pipeline in order to ensure the transport of other events continues uninterrupted. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-2140) Support diverting bad events from pipeline
[ https://issues.apache.org/jira/browse/FLUME-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727912#comment-13727912 ] Arvind Prabhakar commented on FLUME-2140: - Another case - a downstream filter is buggy and causes a batch to fail repeatedly due to a malformed header or some other details. Support diverting bad events from pipeline -- Key: FLUME-2140 URL: https://issues.apache.org/jira/browse/FLUME-2140 Project: Flume Issue Type: New Feature Components: Node Reporter: Arvind Prabhakar A *bad event* can be any event that causes persistent sink side processing failure due to the inherent nature of the event itself. Note that failures that are not related to the inherent nature of the event such as network communication failure, downstream capacity failure etc., do not make the event a bad-event. The presence of a bad event in a channel can cause the entire pipeline to choke and become unusable. Flume should therefore be able to identify bad events and provide a facility to route them out of the pipeline in order to ensure the transport of other events continues uninterrupted.
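The distinction FLUME-2140 draws between bad events and transient failures can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Flume's API or design: the exception classes, the `drain` helper, and the `strict_sink` function are all hypothetical names introduced here.

```python
class TransientFailure(Exception):
    """Failure unrelated to the event itself, e.g. a network outage."""


class BadEventError(Exception):
    """Failure caused by the event's own content, e.g. a malformed header."""


def drain(events, deliver, max_attempts=3):
    """Deliver events in order; divert those that fail because of their content.

    Returns (delivered, diverted). Transient failures are retried up to
    max_attempts and then re-raised, because they do not make an event bad.
    """
    delivered, diverted = [], []
    for event in events:
        for attempt in range(1, max_attempts + 1):
            try:
                deliver(event)
                delivered.append(event)
                break
            except BadEventError:
                diverted.append(event)  # route it out of the pipeline
                break
            except TransientFailure:
                if attempt == max_attempts:
                    raise  # still not a bad event; surface the outage
    return delivered, diverted


def strict_sink(event):
    # Hypothetical sink that rejects events lacking a "host" header.
    if "host" not in event["headers"]:
        raise BadEventError("missing host header")


events = [
    {"headers": {"host": "a"}, "body": b"1"},
    {"headers": {}, "body": b"2"},
    {"headers": {"host": "c"}, "body": b"3"},
]
delivered, diverted = drain(events, strict_sink)
print(len(delivered), len(diverted))
```

The point of the sketch is that the second event is set aside while the third still gets through, whereas a network-style failure is retried and never causes an event to be diverted. How a real sink would tell the two failure kinds apart is exactly the open question in the issue.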
Re: [VOTE] Release Apache Flume version 1.4.0 RC1
Given that there is ambiguity in terms of which license applies, and given that one of these licenses is Apache Software License 2.0, my suggestion is to keep BSD on record for our release. That way, we cover the more restrictive case and ideally should not pose any problems. Regards, Arvind Prabhakar On Fri, Jun 28, 2013 at 4:43 PM, Mike Percy mpe...@apache.org wrote: Interesting find, Hari. These guys are really a licensing disaster. However I believe Maven is wrong since the LICENSE file in their repository contains this: https://code.google.com/p/findbugs/source/browse/branches/1.3.9/findbugs/LICENSE-jsr305.txt -- The JSR-305 reference implementation (lib/jsr305.jar) is distributed under the terms of the New BSD license: http://www.opensource.org/licenses/bsd-license.php See the JSR-305 home page for more information: http://code.google.com/p/jsr-305/ -- So I think it really is BSD. Thoughts? Thanks, Mike On Fri, Jun 28, 2013 at 4:24 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: Hi, Looks like jsr305 is actually ASL2.0 (according to the mvn central pom for the specific version: http://search.maven.org/#artifactdetails%7Ccom.google.code.findbugs%7Cjsr305%7C1.3.9%7Cjar ). The pom installed locally also has this: <licenses> <license> <name>The Apache Software License, Version 2.0</name> <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url> <distribution>repo</distribution> </license> </licenses> The webpage on the other hand says it is BSD licensed. Maybe we should verify this? I know the last few of our releases went out with BSD in the Licenses file. Thanks, Hari On Friday, June 28, 2013 at 1:37 PM, Jarek Jarcec Cecho wrote: +1 * Checked license file * Run tests * Checked other top level files * Checked checksums and signature Jarcec On Mon, Jun 24, 2013 at 07:30:18PM -0700, Mike Percy wrote: This is the fourth release for Apache Flume as a top-level project, version 1.4.0. We are voting on release candidate RC1.
It fixes the following issues: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=blob_plain;f=CHANGELOG;hb=756924e96ace470289472a3bdb4d87e273ca74ef *** Please cast your vote within the next 72 hours *** The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1) for the source and binary artifacts can be found here: http://people.apache.org/~mpercy/flume/apache-flume-1.4.0-RC1/ Maven staging repo: https://repository.apache.org/content/repositories/orgapacheflume-067/ The tag to be voted on: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=756924e96ace470289472a3bdb4d87e273ca74ef Flume's KEYS file containing PGP keys we use to sign the release is here: https://svn.apache.org/repos/asf/flume/dist/KEYS Thanks, Mike
Re: [VOTE] Release Apache Flume version 1.4.0 RC1
+1 * Built the sources * Verified checksums and signatures Thanks for the hard work Mike! Regards, Arvind Prabhakar On Mon, Jun 24, 2013 at 7:30 PM, Mike Percy mpe...@apache.org wrote: This is the fourth release for Apache Flume as a top-level project, version 1.4.0. We are voting on release candidate RC1. It fixes the following issues: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=blob_plain;f=CHANGELOG;hb=756924e96ace470289472a3bdb4d87e273ca74ef *** Please cast your vote within the next 72 hours *** The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1) for the source and binary artifacts can be found here: http://people.apache.org/~mpercy/flume/apache-flume-1.4.0-RC1/ Maven staging repo: https://repository.apache.org/content/repositories/orgapacheflume-067/ The tag to be voted on: https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commit;h=756924e96ace470289472a3bdb4d87e273ca74ef Flume's KEYS file containing PGP keys we use to sign the release is here: https://svn.apache.org/repos/asf/flume/dist/KEYS Thanks, Mike
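The checksum part of the verification mentioned in these votes can be sketched as below. This is an illustration only, assuming the published `.md5` or `.sha1` file boils down to a hex digest that may contain spaces or upper-case letters; the helper names `file_digest` and `checksum_matches` are hypothetical, and checking the `.asc` signature against the KEYS file needs a PGP tool and is not covered.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm):
    """Hex digest of a file, read in chunks so large tarballs fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex, algorithm):
    """Compare a computed digest with a published one, ignoring case and spaces."""
    normalized = "".join(expected_hex.split()).lower()
    return file_digest(path, algorithm) == normalized


# Demo on a throwaway file rather than a real release artifact.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

sha1_ok = checksum_matches(
    demo_path, "AAF4C61D DCC5E8A2 DABEDE0F 3B482CD9 AEA9434D", "sha1"
)
md5_ok = checksum_matches(demo_path, "5d41402abc4b2a76b9719d911017c592", "md5")
os.remove(demo_path)
print(sha1_ok, md5_ok)
```

For a real candidate, the path would point at the downloaded tarball and the expected value would come from the matching checksum file in the release directory.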
Re: [DISCUSS] Flume 1.4 release plan
Thanks for taking this initiative Mike! +1 for 1.4 and Mike as RM. Regards, Arvind Prabhakar On Wed, May 22, 2013 at 12:45 AM, Hari Shreedharan hshreedha...@cloudera.com wrote: +1 for Flume 1.4 +1 for Mike being RM. Cheers, Hari On Wednesday, May 22, 2013 at 12:33 AM, Mike Percy wrote: Hi folks, We have had over 100 commits since 1.3.1, and a bunch of new features and improvements including a Thrift source, much improved ElasticSearch sink, support for a new plugins directory and layout, compression support in the avro sink/source, improved checkpointing in the file channel and more, plus a lot of bug fixes. It seems to me that it's time to start thinking about cutting a 1.4 release. I would be happy to volunteer to RM the release. Worth noting that I will be unavailable for the next two weeks... but after that I'd be happy to pick this up and run with it. That's also a decent amount of time for people to get moving on patches and reviews for their favorite features, bug fixes, etc. If this all sounds OK, I'd like to suggest targeting the last week of June as a release date. If we can release in time for Hadoop Summit then that would be pretty nice. Otherwise, if something comes up and we can't get the release out that week, let's shoot for the first week of July at the latest. Please let me know your thoughts. Regards, Mike
Re: Flume schedule
Hi Aline, Currently there is no discussion around the timing for Flume 1.4.0. Could you share your motivation behind asking for the release schedule? Regards, Arvind Prabhakar On Mon, Feb 4, 2013 at 9:10 AM, Aline Guedes aline...@linux.vnet.ibm.com wrote: Hello, Is there a schedule for Flume available somewhere? I am interested in the planned release date for Flume 1.4.0 (in case there is a planned date), but I can't find it anywhere. Thanks! Aline
Re: [VOTE] Release Apache Flume 1.3.1
+1 * Verified signatures and hash sums * Build and tests work fine * Top level files look good. Thanks for driving this Hari. Regards, Arvind Prabhakar On Fri, Dec 21, 2012 at 11:44 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: Hi all, This is the third release for Apache Flume as a top-level project, version 1.3.1. We are voting on release candidate rc0. *** This vote will remain open for at least 72 hours *** The list of fixed issues: http://s.apache.org/01x The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1) for the source and binary artifacts can be found at: http://people.apache.org/~hshreedharan/apache-flume-1.3.1-rc0/ Nexus Staging Repository: https://repository.apache.org/content/repositories/orgapacheflume-074/ The tag we are voting on: http://s.apache.org/L8q The KEYS file can be found here: https://dist.apache.org/repos/dist/release/flume/KEYS Thanks, Hari Shreedharan -- Hari Shreedharan
Re: [ANNOUNCE] Apache Flume 1.3.0 released
Thanks for your hard work Brock! Appreciate your diligence and resolve in getting this through! Regards, Arvind Prabhakar On Tue, Dec 4, 2012 at 9:16 PM, Will McQueen w...@cloudera.com wrote: Great job Brock! And thank you to everyone who contributed! Cheers, Will On Tue, Dec 4, 2012 at 8:37 PM, Mike Percy mpe...@apache.org wrote: Hear, hear! Brock, well done sir, thanks for all your excellent hard work on this release! Regards, Mike On Tue, Dec 4, 2012 at 8:30 AM, Jarek Jarcec Cecho jar...@apache.org wrote: Thank you Brock for driving this release, you've done excellent job as a Release manager! Jarcec On Tue, Dec 04, 2012 at 10:13:58AM -0600, Brock Noland wrote: The Apache Flume team is pleased to announce the release of Flume version 1.3.0. Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. This release can be downloaded from the Flume download page at: http://flume.apache.org/download.html The change log and documentation are available on the 1.3.0 release page: http://flume.apache.org/releases/1.3.0.html Your help and feedback is more than welcome. For more information on how to report problems and to get involved, visit the project website at http://flume.apache.org/ The Apache Flume Team
Re: https://cwiki.apache.org/confluence/display/FLUME/Index
Hi Alex, I have granted you temporary administrator privileges for both the spaces. Please let me know as soon as you are done with the changes as I will have to revert the privileges back to normal. Regards, Arvind Prabhakar On Thu, Nov 15, 2012 at 10:40 PM, Alexander Alten-Lorenz wget.n...@gmail.com wrote: Hi Arvid, Can I please have Space Admin rights for Sqoop and Flume Confluence spaces to configure the index as well the favicon? Thanks, Alex Begin forwarded message: From: Alexander Alten-Lorenz wget.n...@gmail.com Subject: Re: https://cwiki.apache.org/confluence/display/FLUME/Index Date: November 15, 2012 10:11:53 AM GMT+01:00 To: dev@flume.apache.org I did some changes today, but I guess confluence has a bug: https://cwiki.apache.org/FLUME/index.html - the h1. line is missing, and the layout was bad. I did a ugly hack (adding 5 spaces as a own column). https://cwiki.apache.org/confluence/display/FLUME/Index - looks much better, but will not shown as the index.html I didn't figured out what the heck is going on there, I guess a bug in the html exporter (parser)? Anyway, looks now cleaner and I moved Mike's both article into the blog section and linked that together to get better search engines results. Thanks, Alex On Nov 15, 2012, at 3:21 AM, Brock Noland br...@cloudera.com wrote: Big +1 Thank you very much! On Wed, Nov 14, 2012 at 6:05 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: Excellent work! This was something I heard from many people - the wiki is the top result if you search for Flume docs, and it pointed to OG stuff. Thanks to you, now that is taken care of. Thanks a lot for this effort! Hari -- Hari Shreedharan On Wednesday, November 14, 2012 at 3:52 PM, Mike Percy wrote: Alex, this looks great! Thanks so much for spending the time to reorganize the Wiki. It is way more useful. 
Regards, Mike On Wed, Nov 14, 2012 at 6:27 AM, Alexander Alten-Lorenz wget.n...@gmail.com wrote: Guys, I've spent most of my day today to reorganize the wiki, please have a look and ping me with all stuff you miss or what we should organize better. I was moving all the OG stuff into a new section, called Flume OG (pre 1.0), the same I did with all Flume NG stuff. Also I added a blogpost about flume's memory consumption and will add some from time to time with topics we figured out in our mailing list. Of course, add own stuff too! I did some cosmetic changes too (include our logo as example). best, Alex -- Alexander Alten-Lorenz http://mapredit.blogspot.com German Hadoop LinkedIn Group: http://goo.gl/N8pCF -- Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
Re: [ANNOUNCE] New Apache Flume committer - Patrick Wendell
Congratulations Patrick! Well deserved! Regards, Arvind Prabhakar On Mon, Nov 12, 2012 at 1:04 PM, Hari Shreedharan hshreedha...@cloudera.com wrote: On behalf of the Apache Flume PMC, I am excited to welcome Patrick Wendell as a committer on Flume! Patrick has contributed significantly to the project, by adding new features, fixing bugs and helping users on the Flume users list. Here is a list of jiras Patrick has worked on: http://s.apache.org/6EG Please join me in congratulating Patrick on his new role! Thanks, Hari
[jira] [Commented] (FLUME-1502) Support for running simple configurations embedded in host process
[ https://issues.apache.org/jira/browse/FLUME-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491105#comment-13491105 ] Arvind Prabhakar commented on FLUME-1502: - @Brock, thanks for the design document. On the point of File Channel, I do feel that it is important to have that support to ensure that we do not put excessive strain on memory for the host process, and that we do not lose events in the case of host process failure. Another point to consider is whether the source would be any different from a regular source when running in embedded mode. For example, does it make sense to have embedded agent with a network source like Avro working on it? For instance, it may make sense to have no source support, but a direct pass-through for the client API that directly talks with the channel in question. Support for running simple configurations embedded in host process -- Key: FLUME-1502 URL: https://issues.apache.org/jira/browse/FLUME-1502 Project: Flume Issue Type: Improvement Affects Versions: v1.2.0 Reporter: Arvind Prabhakar Assignee: Brock Noland Attachments: embeeded-agent-1.pdf Flume should provide a light-weight embeddable node manager that can be started in process where necessary. This will allow the users to embed light-weight agents within the host process where necessary.
[jira] [Commented] (FLUME-1573) Duplicated HDFS file name when multiple SinkRunner was existing
[ https://issues.apache.org/jira/browse/FLUME-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465681#comment-13465681 ] Arvind Prabhakar commented on FLUME-1573: - @Denny - a sink is an independent, isolated component of Flume. It cannot assume any knowledge of other sink(s) operating within the same agent. Having a synchronization requirement across multiple sinks breaks this invariant. However, if within the same sink there are problems due to collisions between different bucket writers, that would be a bug and merits fixing. From the explanation above that does not seem to be the case to me. Duplicated HDFS file name when multiple SinkRunner was existing --- Key: FLUME-1573 URL: https://issues.apache.org/jira/browse/FLUME-1573 Project: Flume Issue Type: Bug Components: Sinks+Sources Affects Versions: v1.2.0 Reporter: Denny Ye Assignee: Denny Ye Fix For: v1.3.0 Attachments: FLUME-1573.patch Multiple HDFS Sinks to write events into storage. Timeout exception is always happening: {code:xml} 11 Sep 2012 07:04:53,478 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:442) - HDFS IO error java.io.IOException: Callable timed out after 1 ms at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:342) at org.apache.flume.sink.hdfs.HDFSEventSink.append(HDFSEventSink.java:713) at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:412) at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68) at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) at java.lang.Thread.run(Thread.java:619) Caused by: java.util.concurrent.TimeoutException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228) at java.util.concurrent.FutureTask.get(FutureTask.java:91) at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:335) ...
5 more {code} I suspected an HDFS timeout or slow response. As expected, I found a duplicate-creation exception for the same file in HDFS. Flume also logged the same file name twice. {code:xml} 13 Sep 2012 02:09:35,432 INFO [hdfs-hdfsSink-3-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp 13 Sep 2012 02:09:36,425 INFO [hdfs-hdfsSink-4-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp {code} Different threads tried to create the same file, though not at the same instant. The root cause may be incorrect use of the AtomicLong property named 'fileExtensionCounter' in BucketWriter. Different threads should share a single counter protected by CAS, rather than each thread holding its own private counter, which does nothing to prevent HDFS path conflicts.
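The reporter's diagnosis can be illustrated with a small model. It is a sketch of the hypothesis only, not of the actual BucketWriter code: the `Counter` and `Writer` classes are hypothetical stand-ins, the lock-protected counter substitutes for Java's AtomicLong, and the assumption that both writers start from the same seed value is made purely to reproduce the colliding name from the log lines above.

```python
import threading


class Counter:
    """Lock-protected counter, standing in for Java's AtomicLong."""

    def __init__(self, start):
        self._value = start
        self._lock = threading.Lock()

    def increment_and_get(self):
        with self._lock:
            self._value += 1
            return self._value


class Writer:
    """Hypothetical stand-in for a bucket writer that names its own files."""

    def __init__(self, counter):
        self._counter = counter

    def next_path(self, prefix):
        return f"{prefix}.{self._counter.increment_and_get()}.tmp"


SEED = 1347501924110  # assumed starting value shared by both writers
PREFIX = "/FLUME/dt=2012-09-13/02-host"

# Private counters seeded with the same value produce the same path.
private = [Writer(Counter(SEED)).next_path(PREFIX) for _ in range(2)]

# One counter shared by both writers never hands out the same number twice.
shared_counter = Counter(SEED)
shared = [Writer(shared_counter).next_path(PREFIX) for _ in range(2)]

print(private[0] == private[1], shared[0] == shared[1])
```

Note that sharing one counter across writers belonging to different sinks is the kind of cross-sink coupling Arvind's comment argues against, so this shows why the names collide rather than which fix the project should adopt.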
Flume builds back online
The Flume builds were previously disabled due to the repository change. Updating the configuration and restricting it to the nodes that have Git support seems to have worked: https://builds.apache.org/job/flume-trunk/281/ I also took the liberty of enabling email notifications, but to minimize the overall volume of mail I reduced the frequency from hourly to daily. Regards, Arvind Prabhakar
[jira] [Commented] (FLUME-1424) File Channel should support encryption
[ https://issues.apache.org/jira/browse/FLUME-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429992#comment-13429992 ] Arvind Prabhakar commented on FLUME-1424: - Yes, the put records do store the data in them. We can perhaps start with that as a first step and if more requirements pop-up, we can address them in follow-up Jiras as necessary. File Channel should support encryption -- Key: FLUME-1424 URL: https://issues.apache.org/jira/browse/FLUME-1424 Project: Flume Issue Type: Bug Reporter: Arvind Prabhakar Assignee: Arvind Prabhakar When persisting the data to disk, the File Channel should allow some form of encryption to ensure safety of data.
[jira] [Commented] (FLUME-1424) File Channel should support encryption
[ https://issues.apache.org/jira/browse/FLUME-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428457#comment-13428457 ] Arvind Prabhakar commented on FLUME-1424: - @Ralph - this is definitely one way to address this requirement. The advantage (and perhaps a disadvantage at the same time) of this approach is that it will only incorporate encryption for the put records. Another way to do this is to implement encryption at the LogFile.Writer/Reader level where the byte buffers are serialized between transaction boundaries. This approach will have a higher performance penalty but would encrypt every file channel record regardless of type. File Channel should support encryption -- Key: FLUME-1424 URL: https://issues.apache.org/jira/browse/FLUME-1424 Project: Flume Issue Type: Bug Reporter: Arvind Prabhakar Assignee: Arvind Prabhakar When persisting the data to disk, the File Channel should allow some form of encryption to ensure safety of data.
[jira] [Created] (FLUME-1424) File Channel should support encryption
Arvind Prabhakar created FLUME-1424: --- Summary: File Channel should support encryption Key: FLUME-1424 URL: https://issues.apache.org/jira/browse/FLUME-1424 Project: Flume Issue Type: Bug Reporter: Arvind Prabhakar When persisting the data to disk, the File Channel should allow some form of encryption to ensure safety of data.
[jira] [Created] (FLUME-1380) File channel log can record the op code and not the operation in some cases
Arvind Prabhakar created FLUME-1380: --- Summary: File channel log can record the op code and not the operation in some cases Key: FLUME-1380 URL: https://issues.apache.org/jira/browse/FLUME-1380 Project: Flume Issue Type: Bug Reporter: Arvind Prabhakar Assignee: Arvind Prabhakar There is a race condition in the system where the log file can record the beginning of a record and be shut down before the rest of the record is written out. This will lead to the system not starting up correctly again with exceptions like: {noformat} ERROR file.Log: Failed to initialize Log java.io.IOException: Header 80808080 not expected value: deadbeef {noformat}
[jira] [Updated] (FLUME-1380) File channel log can record the op code and not the operation in some cases
[ https://issues.apache.org/jira/browse/FLUME-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated FLUME-1380: Attachment: FLUME-1380-1.patch File channel log can record the op code and not the operation in some cases --- Key: FLUME-1380 URL: https://issues.apache.org/jira/browse/FLUME-1380 Project: Flume Issue Type: Bug Reporter: Arvind Prabhakar Assignee: Arvind Prabhakar Attachments: FLUME-1380-1.patch There is a race condition in the system where the log file can record the beginning of a record and be shut down before the rest of the record is written out. This will lead to the system not starting up correctly again with exceptions like: {noformat} ERROR file.Log: Failed to initialize Log java.io.IOException: Header 80808080 not expected value: deadbeef {noformat}
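The failure mode in FLUME-1380, a record whose beginning reaches disk but whose remainder does not, can be sketched with a toy log format. The layout used here (a 4-byte marker followed by a 4-byte payload length) is an assumption made for illustration and is not the File Channel's real on-disk format; only the marker value `deadbeef` and the wording of the error are taken from the message quoted above, and `write_record` and `read_records` are hypothetical helpers.

```python
import io
import struct

MAGIC = 0xDEADBEEF  # marker value taken from the error message above
HEADER = struct.Struct(">II")  # assumed layout: marker, payload length


def write_record(stream, payload):
    stream.write(HEADER.pack(MAGIC, len(payload)))
    stream.write(payload)


def read_records(data):
    """Return (complete payloads, offset where a torn record starts or None)."""
    records, offset = [], 0
    while offset < len(data):
        if len(data) - offset < HEADER.size:
            return records, offset  # header itself was cut short
        magic, length = HEADER.unpack_from(data, offset)
        if magic != MAGIC:
            raise IOError(f"Header {magic:x} not expected value: {MAGIC:x}")
        body_start = offset + HEADER.size
        if len(data) - body_start < length:
            return records, offset  # header written, body missing
        records.append(data[body_start:body_start + length])
        offset = body_start + length
    return records, None


log = io.BytesIO()
write_record(log, b"put-1")
write_record(log, b"put-2")
intact = log.getvalue()
torn = intact[:-3]  # simulate shutdown before the last body was fully written

print(read_records(intact), read_records(torn))
```

A reader built this way can report the offset of the torn tail instead of refusing to start, which would let recovery discard the incomplete record. Whether the attached patch takes that approach or instead prevents the partial write is not stated in these messages.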
Re: Review Request: Initial version of the Flume web site.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/5765/#review9130 --- Ship it! Ship It! - Arvind Prabhakar On July 8, 2012, 7:36 p.m., Ralph Goers wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/5765/ --- (Updated July 8, 2012, 7:36 p.m.) Review request for Flume. Description --- This contains the source to build the initial version of the web site using the CMS, Maven and Sphinx. To test the build just run mvn site and the output of the site will be in target/site. mvn -P pdf package is supposed to package the users and developers guides as pdf's to be deployed to the site as part of a release but that isn't quite working yet. A few notes: 1. The site will be committed to https://svn.apache.org/repos/asf/flume/site/trunk. 2. The site is incomplete in that it is missing release information. This will be directly added once the site is published to the production svn location. This addresses bug FLUME-813. https://issues.apache.org/jira/browse/FLUME-813 Diffs - Diff: https://reviews.apache.org/r/5765/diff/ Testing --- Thanks, Ralph Goers
Re: [VOTE] Release Apache Flume version 1.2.0 (rc1)
+1 * Binary and Source distributions checksums and signatures match * LICENSE file accounts for all included Jars in the binary distribution * Sources build and test fine. * Top level files all look good * Jira is clean One slight concern (not a blocker): the tag contains sources in contrib that are not included in the source tar-ball. Since these sources are not used for build, we can do without those for now. Thanks for your hard work Mike! Regards, Arvind Prabhakar On Wed, Jul 11, 2012 at 4:57 AM, Mike Percy mpe...@apache.org wrote: This is the first release for Apache Flume as a top-level project, version 1.2.0. We are voting on release candidate rc1. *** Please cast your vote within the next 72 hours *** The list of fixed issues: https://svn.apache.org/repos/asf/flume/tags/flume-1.2.0-rc1/CHANGELOG The tarball (*.tar.gz), signature (*.asc), and checksums (*.md5, *.sha1) for the source and binary artifacts can be found at: https://people.apache.org/~mpercy/flume/apache-flume-1.2.0-rc1/ The tag to be voted on: https://svn.apache.org/repos/asf/flume/tags/flume-1.2.0-rc1 The KEYS file can be found here: https://svn.apache.org/repos/asf/flume/dist/KEYS Changes since rc0: - Updated LICENSE file - Updated DEVNOTES file - Removed DISCLAIMER file from dist.xml and src.xml manifests - pom.xml file updated with TLP info (FLUME-1359) - A build fix to prevent multiple servlet-api jars in lib dir
Re: [DISCUSS] Git as primary source control for Flume
+1 for using Git as primary source control system. Thanks Hari for following up on this. Regards, Arvind Prabhakar On Wed, Jul 11, 2012 at 7:16 PM, Leslin leslin...@gmail.com wrote: +1 for this proposal. Git is fine for me. I never back to SVN after I touched git. 2012/7/12 Mike Percy mpe...@apache.org On Wed, Jul 11, 2012 at 5:45 PM, Ralph Goers ralph.go...@dslextreme.com wrote: IMO the person who wrote the code is the one who should get credit. Of course they should get the credit for the work. Anyone who has ever performed a careful code review knows that it can be time-consuming work. I assume that's one reason why we currently list both the author and the committer in the commit message. Regards, Mike -- Best Regards Leslin
Re: New SVN location for flume
I just updated the authorization file for the new subversion repo. Can you please try and check if this resolves the issue? Regards, Arvind Prabhakar On Sun, Jul 8, 2012 at 11:14 AM, Jarek Jarcec Cecho jar...@apache.org wrote: Thank you Hari for your feedback, I've filed INFRA-5022 to get it resolved. Jarcec https://issues.apache.org/jira/browse/INFRA-5022 On Sun, Jul 08, 2012 at 10:38:54AM -0700, Hari Shreedharan wrote: Same here. Permissions issue. -- Hari Shreedharan On Sunday, July 8, 2012 at 10:17 AM, Jarek Jarcec Cecho wrote: Hi guys, I've tried to commit FLUME-1348 today to our new SVN location on https://svn.apache.org/repos/asf/flume/. Unfortunately, I have failed on permission issue. Can anyone else try test commit just to see whether it's my local issue or it's also affecting anyone else? Jarcec