[jira] [Commented] (FLUME-2307) Remove Log writetimeout
[ https://issues.apache.org/jira/browse/FLUME-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202215#comment-14202215 ] Jeff Lord commented on FLUME-2307:
--
Should we re-open this issue? Not sure if it is still occurring or what.
> Remove Log writetimeout
> ---
>
> Key: FLUME-2307
> URL: https://issues.apache.org/jira/browse/FLUME-2307
> Project: Flume
> Issue Type: Bug
> Components: Channel
> Affects Versions: v1.4.0
> Reporter: Steve Zesch
> Assignee: Hari Shreedharan
> Fix For: v1.5.0
>
> Attachments: FLUME-2307-1.patch, FLUME-2307.patch
>
>
> I've observed Flume failing to clean up old log data in FileChannels. The
> amount of old log data can range anywhere from tens to hundreds of GB. I was
> able to confirm that the channels were in fact empty. This behavior always
> occurs after lock timeouts when attempting to put, take, rollback, or commit
> to a FileChannel. Once the timeout occurs, Flume stops cleaning up the old
> files. I was able to confirm that the Log's writeCheckpoint method was still
> being called and successfully obtaining a lock from tryLockExclusive(), but I
> was not able to confirm removeOldLogs being called. The application log did
> not include "Removing old file: log-xyz" for the old files which the Log
> class would output if they were correctly being removed. I suspect the lock
> timeouts were due to high I/O load at the time.
> Some stack traces:
> {code}
> org.apache.flume.ChannelException: Failed to obtain lock for writing to the log. Try increasing the log write timeout value. [channel=fileChannel]
>     at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doPut(FileChannel.java:478)
>     at org.apache.flume.channel.BasicTransactionSemantics.put(BasicTransactionSemantics.java:93)
>     at org.apache.flume.channel.BasicChannelSemantics.put(BasicChannelSemantics.java:80)
>     at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:189)
> org.apache.flume.ChannelException: Failed to obtain lock for writing to the log. Try increasing the log write timeout value. [channel=fileChannel]
>     at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doCommit(FileChannel.java:594)
>     at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
>     at dataxu.flume.plugins.avro.AsyncAvroSink.process(AsyncAvroSink.java:548)
>     at dataxu.flume.plugins.ClassLoaderFlumeSink.process(ClassLoaderFlumeSink.java:33)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:619)
> org.apache.flume.ChannelException: Failed to obtain lock for writing to the log. Try increasing the log write timeout value. [channel=fileChannel]
>     at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doRollback(FileChannel.java:621)
>     at org.apache.flume.channel.BasicTransactionSemantics.rollback(BasicTransactionSemantics.java:168)
>     at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:194)
>     at dataxu.flume.plugins.avro.AvroSource.appendBatch(AvroSource.java:209)
>     at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.avro.ipc.specific.SpecificResponder.respond(SpecificResponder.java:91)
>     at org.apache.avro.ipc.Responder.respond(Responder.java:151)
>     at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.messageReceived(NettyServer.java:188)
>     at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:75)
>     at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:173)
>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>     at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:792)
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
>     at org.jboss.netty.handler.codec.frame.Fram
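[Editor's note] The exceptions above point at the file channel's write timeout, which is what this ticket removed in v1.5.0. On the affected 1.4.x line the knob lived on the channel itself; the sketch below uses hypothetical agent/channel names, paths, and an illustrative 30-second value (it is not the reporter's configuration):

```properties
# Sketch of a Flume 1.4.x file channel; names and paths are hypothetical.
agent.channels = fileChannel
agent.channels.fileChannel.type = file
agent.channels.fileChannel.checkpointDir = /var/lib/flume/checkpoint
agent.channels.fileChannel.dataDirs = /var/lib/flume/data
# Seconds to wait for the log's exclusive lock before throwing the
# "Failed to obtain lock for writing to the log" ChannelException.
agent.channels.fileChannel.write-timeout = 30
```

Since FLUME-2307 removed the timeout in 1.5.0, this property only applies to the 1.4.x releases where the reported symptom occurs.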
Re: Flume error handling
Hi Sverre, Have you taken a look at the EventDrivenSourceRunner ? I think this *may* help. https://github.com/apache/flume/blob/trunk/flume-ng-core/src/main/java/org/apache/flume/source/EventDrivenSourceRunner.java -Jeff On Thu, Nov 6, 2014 at 4:44 AM, Sverre Bakke wrote: > Hi, > > When creating a new EventDrivenSource running as an executor, what is > the correct approach to handling shutdown gracefully? > > I am writing a custom source that will poll a compressed file line by > line using BufferedReader and push these lines to a > ChannelProcessor using processEvent(). This is a result of Spooling > Directory not supporting compressed files. This also means that most > of the time, my Flume source will be blocking on > BufferedReader.readLine() or blocking on > ChannelProcessor.processEvent(). > > If I shutdown the executor from the stop() method of my source, the > typical response from Flume will be that the ChannelProcessor will > generate a ChannelException. In what situations can I expect that the > ChannelException actually is the result of a shutdown (e.g. ctrl+c) > rather than some other issue that should be handled as a truly > exceptional situation/error? Or am I approaching graceful shutdown > completely wrong? > > Is there any specific order in which the Flume sources, interceptors > and sinks are signaled to shut down? > > I feel that when it comes to error handling (and shutdowns), the > developer guide and javadoc are a bit lacking unfortunately. > > Regards, > Sverre Bakke >
Re: [ANNOUNCE] New Flume PMC Member - Roshan Naik
Congrats Roshan On Tue, Nov 4, 2014 at 2:31 PM, Hari Shreedharan wrote: > Congrats Roshan! > > > Thanks, > Hari > > On Tue, Nov 4, 2014 at 2:12 PM, Arvind Prabhakar > wrote: > > > On behalf of Apache Flume PMC, it is my pleasure to announce that Roshan > > Naik has been elected to the Flume Project Management Committee. Roshan > has > > been active with the project for many years and has been a committer on > the > > project since September of 2013. > > Please join me in congratulating Roshan and welcoming him to the Flume > PMC. > > Regards, > > Arvind Prabhakar >
Re: Flume 0.X JIRA's
I think this sounds like a good idea. On Mon, Nov 3, 2014 at 1:44 AM, Ashish wrote: > Folks, > > Whats the plan for pre-FlumeNG jira's? To me the branch seems dead. > IMHO, we can mark the ticket's as Won't Fix and clean up JIRA. > > wdyt? > > thanks > ashish >
[jira] [Updated] (FLUME-1960) Flume hdfs sink missing serializer documentation for HEADER_AND_TEXT serializer
[ https://issues.apache.org/jira/browse/FLUME-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1960: - Attachment: FLUME-1960-0.patch This is a simple doc patch so no review board necessary. > Flume hdfs sink missing serializer documentation for HEADER_AND_TEXT > serializer > --- > > Key: FLUME-1960 > URL: https://issues.apache.org/jira/browse/FLUME-1960 > Project: Flume > Issue Type: Documentation > Components: Docs >Affects Versions: v1.3.1 >Reporter: Rob Johnson >Assignee: Jeff Lord >Priority: Minor > Labels: noob > Attachments: FLUME-1960-0.patch > > > A special serializer for HDFS Sink was added to Flume a while back, but it's > not documented. This serializer is useful when the source is any type of > syslog source. > Without specifying the serializer, the timestamp and host are not logged to > the file with the event information, which is pretty useless without the > timestamp and hosts. > The serializer can be configured on an hdfs sink like so: > agent1.sinks.k1.serializer=HEADER_AND_TEXT > Without this serializer specified you get (for example): > adclient[12112]: INFO daemon.main Start trusted domain > discovery > as an event. > When you specify the serializer, the same event looks like this: > {timestamp=1364380838000, Severity=6, host=myhostname, Facility=4} > adclient[12112]: INFO daemon.main Start trusted domain > discovery > Which is much more useful. -- This message was sent by Atlassian JIRA (v6.2#6252)
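[Editor's note] For context, the serializer line from the description slots into an HDFS sink config like the sketch below. Agent/component names, the port, and the path are hypothetical; a syslog source is used because the description names that as the main use case:

```properties
# Hypothetical single-agent sketch: syslog UDP source -> HDFS sink.
agent1.sources.r1.type = syslogudp
agent1.sources.r1.host = 0.0.0.0
agent1.sources.r1.port = 5140
agent1.sources.r1.channels = c1
agent1.channels.c1.type = memory
agent1.sinks.k1.type = hdfs
agent1.sinks.k1.channel = c1
agent1.sinks.k1.hdfs.path = /flume/syslog/%Y-%m-%d
# Write the event's header map (timestamp, host, Severity, Facility)
# ahead of the body instead of the body alone.
agent1.sinks.k1.serializer = HEADER_AND_TEXT
```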
Re: HDFS sink throughput
Are you getting any errors back in the flume log? On Wed, Mar 5, 2014 at 10:11 AM, Nikolaos Tsipas wrote: > Hello, > > We are using flume's HDFS sink to store log data in Amazon S3 and we are > facing some throughput issues. In our flume config we have an avro source, a > file channel and the hdfs sink. The file channel is configured on a > provisioned IOPS EBS volume and we are running on an m1.large EC2 instance > (flume 1.4.0, java 1.7.0). > > Below you will find an example metric from our s3-file-channel. The main > issue is that the "EventTakeSuccessCount" can't cope with the > "EventPutSuccessCount" and as a result our "ChannelSize" increases over time. > > We tried to use multiple hdfs-sinks but it didn't have any positive effect. > Strangely, the problem is still there even when a memory channel is used. > Another interesting fact is that we are also using an identical file-channel > with the elasticsearch-sink and under the same load we don't have any > throughput issues. > > We would appreciate any suggestions that could help us improve the > performance of the hdfs sink. > > Regards, > Nick > > "CHANNEL.s3-file-channel": { > "ChannelCapacity": "1500", > "ChannelFillPercentage": "11.6603", > "ChannelSize": "1749045", > "EventPutAttemptCount": "938299", > "EventPutSuccessCount": "938181", > "EventTakeAttemptCount": "648801", > "EventTakeSuccessCount": "635000", > "StartTime": "1394038826288", > "StopTime": "0", > "Type": "CHANNEL" > > > > > > > http://www.bbc.co.uk > This e-mail (and any attachments) is confidential and may contain personal > views which are not the views of the BBC unless specifically stated. > If you have received it in error, please delete it from your system. > Do not use, copy or disclose the information in any way nor act in reliance > on it and notify the sender immediately. > Please note that the BBC monitors e-mails sent or received. > Further communication will signify your consent to this. > > -
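[Editor's note] Besides checking the flume log for errors, a common first lever when EventTakeSuccessCount lags EventPutSuccessCount is the HDFS sink's batch and roll settings, since each take/commit transaction on a file channel pays an fsync. A sketch with hypothetical names and illustrative values, not a guaranteed fix for this report:

```properties
# Larger batches amortize per-transaction overhead on the take side.
agent.sinks.s3sink.type = hdfs
agent.sinks.s3sink.channel = s3-file-channel
agent.sinks.s3sink.hdfs.batchSize = 1000
# Very frequent rolls also stall the sink thread while files are closed.
agent.sinks.s3sink.hdfs.rollCount = 0
agent.sinks.s3sink.hdfs.rollInterval = 300
```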
[jira] [Created] (FLUME-2339) JMS Source configuration example causes a javax.naming.NameNotFoundException. Removing or commenting out the "connectionFactory" property fixes this
Jeff Lord created FLUME-2339:
--
Summary: JMS Source configuration example causes a javax.naming.NameNotFoundException. Removing or commenting out the "connectionFactory" property fixes this
Key: FLUME-2339
URL: https://issues.apache.org/jira/browse/FLUME-2339
Project: Flume
Issue Type: Improvement
Components: Docs
Reporter: Jeff Lord

Submitted on behalf of Richard Ross
The example jms source in the docs lists a connectionFactory property. For an activemq source this will cause it to fail on startup as the connectionFactory is not used. We should update the docs to reflect this.
The default JMS Source configuration example causes a javax.naming.NameNotFoundException. Removing or commenting out the "connectionFactory" property fixes this. Maybe this should be documented?
Below is my full configuration example that reproduces the error:

# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = jms
a1.sources.r1.channels = c1
a1.sources.r1.initialContextFactory = org.apache.activemq.jndi.ActiveMQInitialContextFactory
a1.sources.r1.connectionFactory = GenericConnectionFactory
a1.sources.r1.providerURL = tcp://mqserver:61616
a1.sources.r1.destinationName = BUSINESS_DATA
a1.sources.r1.destinationType = QUEUE
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

-- This message was sent by Atlassian JIRA (v6.2#6252)
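[Editor's note] Per the report, the fix is simply to drop or comment out the connectionFactory line, so the source fragment of the example above would become:

```properties
a1.sources.r1.type = jms
a1.sources.r1.channels = c1
a1.sources.r1.initialContextFactory = org.apache.activemq.jndi.ActiveMQInitialContextFactory
# connectionFactory omitted: with ActiveMQ's JNDI context the lookup of
# "GenericConnectionFactory" fails with javax.naming.NameNotFoundException.
# a1.sources.r1.connectionFactory = GenericConnectionFactory
a1.sources.r1.providerURL = tcp://mqserver:61616
a1.sources.r1.destinationName = BUSINESS_DATA
a1.sources.r1.destinationType = QUEUE
```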
Re: Event header validation using interceptors
Wolfgang, Will the morphline interceptor + grok actually match event headers or just the event body? -Jeff On Thu, Feb 13, 2014 at 10:05 AM, Wolfgang Hoschek wrote: > You could probably do this with a MorphlineInterceptor, e.g. via using the > grok command in combination with the tryCatch command. > > http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor > http://kitesdk.org/docs/current/kite-morphlines/index.html > http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#grok > http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#tryRules > > Wolfgang. > > On Feb 13, 2014, at 7:58 PM, Nikolaos Tsipas wrote: > >> Hello, >> >> We have a use case that requires the validation of headers on events >> received by an avro source in order to consider an event as valid or >> invalid. If an event is invalid then it should be routed to a different >> channel. >> >> We know how to route events based on the values of specific headers using >> multiplexing. However, for the regex validation of headers flume doesn't >> seem to provide any appropriate interceptors. >> >> For this reason, we are thinking to create a new interceptor that would >> allow regex validation of headers and depending on the outcome a header >> would be added (e.g. valid = true) >> >> Questions: >> >> * Does the above sound like a reasonable solution for what we want to >> achieve? >> * What would be the best way to implement it in order to be beneficial for >> the flume community? Extend the functionality of one of the existing >> interceptors (e.g. RegexFilteringInterceptor) or provide a new one? >> >> Regards, >> Nikolaos >
Re: [DISCUSS] Release Flume 1.5.0
+1 for a release +1 for resuming the contrib discussion On Thu, Jan 30, 2014 at 12:27 PM, Wolfgang Hoschek wrote: > +1 There a many important new features and fixes ready to go. > > Wolfgang. > > On Jan 30, 2014, at 7:43 PM, Chiwan Park wrote: > > > +1 on new release! > > > > -- > > Regards, > > Chiwan Park > > > > On Jan 31, 2014, at 2:17 AM, Hari Shreedharan > wrote: > > > >> Hi folks, > >> > >> It has been about 6 months since we did a release. We have added several > >> new features and fixed a lot of bugs. What do you guys think about > >> releasing Flume 1.5.0? > >> > >> > >> Thanks > >> Hari > > > >
[jira] [Assigned] (FLUME-2291) Compressed Sequence files should not use default extension for codecs that are not splittable
[ https://issues.apache.org/jira/browse/FLUME-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-2291: Assignee: Jeff Lord > Compressed Sequence files should not use default extension for codecs that > are not splittable > - > > Key: FLUME-2291 > URL: https://issues.apache.org/jira/browse/FLUME-2291 > Project: Flume > Issue Type: Bug >Reporter: Hari Shreedharan > Assignee: Jeff Lord > > If snappy is used to compress writes to a sequence file, they should still be > splittable, even though snappy itself is not. Currently we write such files > out using a ".snappy" extension by default, which causes MR to think that the > file is not splittable. We should change the default file extension for > sequence files -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (FLUME-2279) HDFSSequenceFile doesn't support custom serializers
[ https://issues.apache.org/jira/browse/FLUME-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853571#comment-13853571 ] Jeff Lord commented on FLUME-2279: -- If you wish to write seq files to hdfs then you have the option to use Text or Writable (default) hdfs.writeFormat = Writable Is that what you are after? > HDFSSequenceFile doesn't support custom serializers > --- > > Key: FLUME-2279 > URL: https://issues.apache.org/jira/browse/FLUME-2279 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Reporter: Harpreet Singh > > The HDFSEventSink has a serializer parameter that can be specified in the > config. However, if the fileType is set to sequence file (HDFSSequenceFile), > the serializer is not used. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
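[Editor's note] In config form, the option the comment describes looks like the sketch below (agent/sink names hypothetical):

```properties
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.fileType = SequenceFile
# Text or Writable (the default). A configured serializer is ignored
# for sequence files, which is exactly what this ticket reports.
a1.sinks.k1.hdfs.writeFormat = Writable
```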
[jira] [Updated] (FLUME-2056) Allow SpoolDir to pass just the filename that is the source of an event
[ https://issues.apache.org/jira/browse/FLUME-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2056: - Attachment: FLUME-2056-1.patch Revised unit test. while (source.getSourceCounter().getEventAcceptedCount() < 8) { Thread.sleep(10); } > Allow SpoolDir to pass just the filename that is the source of an event > --- > > Key: FLUME-2056 > URL: https://issues.apache.org/jira/browse/FLUME-2056 > Project: Flume > Issue Type: New Feature >Reporter: Jeff Lord > Assignee: Jeff Lord >Priority: Minor > Attachments: FLUME-2056-1.patch, FLUME-2056.0.patch > > > Currently we allow for passing of the absolute path. > It would be nice to just pass the filename in the event headers and allow for > using that on the hdfs sink. > if (annotateFileName) { > String filename = currentFile.get().getFile().getAbsolutePath(); > for (Event event : events) { > event.getHeaders().put(fileNameHeader, filename); > } > } -- This message was sent by Atlassian JIRA (v6.1.4#6159)
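[Editor's note] Assuming the properties added by the attached patch are named basenameHeader and basenameHeaderKey (the thread does not spell the names out, so treat this as an illustration), usage would look roughly like:

```properties
# Hypothetical names throughout; only the basenameHeader* keys are the
# feature under discussion.
a1.sources.src1.type = spooldir
a1.sources.src1.spoolDir = /var/spool/flume
a1.sources.src1.channels = c1
# Put just the file's basename, not its absolute path, in each event header.
a1.sources.src1.basenameHeader = true
a1.sources.src1.basenameHeaderKey = basename
# ...which the HDFS sink can then reference via header escaping.
a1.sinks.k1.hdfs.filePrefix = %{basename}
```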
[jira] [Assigned] (FLUME-2270) Twitter Source Documentation Does not load properly
[ https://issues.apache.org/jira/browse/FLUME-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-2270: Assignee: Jeff Lord > Twitter Source Documentation Does not load properly > --- > > Key: FLUME-2270 > URL: https://issues.apache.org/jira/browse/FLUME-2270 > Project: Flume > Issue Type: Bug > Components: Docs >Affects Versions: v1.5.0 > Reporter: Jeff Lord >Assignee: Jeff Lord >Priority: Trivial > Attachments: FLUME-2270.patch > > > Experimental twitter source documentation does not display properly. > There are a couple of columns that are missing from the display. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (FLUME-2270) Twitter Source Documentation Does not load properly
[ https://issues.apache.org/jira/browse/FLUME-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2270: - Attachment: FLUME-2270.patch Skipping reviewboard as this is a trivial doc patch > Twitter Source Documentation Does not load properly > --- > > Key: FLUME-2270 > URL: https://issues.apache.org/jira/browse/FLUME-2270 > Project: Flume > Issue Type: Bug > Components: Docs >Affects Versions: v1.5.0 > Reporter: Jeff Lord >Priority: Trivial > Attachments: FLUME-2270.patch > > > Experimental twitter source documentation does not display properly. > There are a couple of columns that are missing from the display. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Created] (FLUME-2270) Twitter Source Documentation Does not load properly
Jeff Lord created FLUME-2270: Summary: Twitter Source Documentation Does not load properly Key: FLUME-2270 URL: https://issues.apache.org/jira/browse/FLUME-2270 Project: Flume Issue Type: Bug Components: Docs Affects Versions: v1.5.0 Reporter: Jeff Lord Priority: Trivial Experimental twitter source documentation does not display properly. There are a couple of columns that are missing from the display. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (FLUME-2217) Preserve priority, timestamp and hostname fields in MultiportSyslogTcp and Udp sources
[ https://issues.apache.org/jira/browse/FLUME-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2217: - Attachment: FLUME-2217.6.patch > Preserve priority, timestamp and hostname fields in MultiportSyslogTcp and > Udp sources > -- > > Key: FLUME-2217 > URL: https://issues.apache.org/jira/browse/FLUME-2217 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.5.0 >Reporter: Jeff Lord >Assignee: Jeff Lord > Attachments: FLUME-2217.1.patch, FLUME-2217.2.patch, > FLUME-2217.3.patch, FLUME-2217.6.patch > > > Flume-1666 added the ability to preserve timestamp and hostname fields of a > syslog message. We should also add this property to the MultiportSyslogTcp > Source and the SyslogUdp sources. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (FLUME-2006) in Avro the batch size is called batch-size, in all other sources batchSize
[ https://issues.apache.org/jira/browse/FLUME-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2006: - Assignee: Ashish Paliwal (was: Jeff Lord) > in Avro the batch size is called batch-size, in all other sources batchSize > --- > > Key: FLUME-2006 > URL: https://issues.apache.org/jira/browse/FLUME-2006 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.3.1 >Reporter: Alexander Alten-Lorenz >Assignee: Ashish Paliwal >Priority: Trivial > > http://mail-archives.apache.org/mod_mbox/flume-user/201304.mbox/%3c746ae600-7783-40f4-9817-25617370c...@gmail.com%3e > The mismatch is with Avro Sink as well as Thrift sink. Other sinks use > batchSize as param name -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (FLUME-2056) Allow SpoolDir to pass just the filename that is the source of an event
[ https://issues.apache.org/jira/browse/FLUME-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2056: - Attachment: FLUME-2056.0.patch Attached is a patch which adds this functionality to put the basename of a file in the event headers. Reviewboard link to follow shortly. > Allow SpoolDir to pass just the filename that is the source of an event > --- > > Key: FLUME-2056 > URL: https://issues.apache.org/jira/browse/FLUME-2056 > Project: Flume > Issue Type: New Feature > Reporter: Jeff Lord >Assignee: Jeff Lord >Priority: Minor > Attachments: FLUME-2056.0.patch > > > Currently we allow for passing of the absolute path. > It would be nice to just pass the filename in the event headers and allow for > using that on the hdfs sink. > if (annotateFileName) { > String filename = currentFile.get().getFile().getAbsolutePath(); > for (Event event : events) { > event.getHeaders().put(fileNameHeader, filename); > } > } -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (FLUME-2097) Add docs for Avro Serializer and describe how to make the Avro Deserializer work with the Avro Serializer
[ https://issues.apache.org/jira/browse/FLUME-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-2097: Assignee: Jeff Lord > Add docs for Avro Serializer and describe how to make the Avro Deserializer > work with the Avro Serializer > - > > Key: FLUME-2097 > URL: https://issues.apache.org/jira/browse/FLUME-2097 > Project: Flume > Issue Type: Documentation >Reporter: Mike Percy > Assignee: Jeff Lord >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (FLUME-2217) Preserve priority, timestamp and hostname fields in MultiportSyslogTcp and Udp sources
[ https://issues.apache.org/jira/browse/FLUME-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2217: - Attachment: FLUME-2217.3.patch Removed baosRaw and extra byte array. Modified the initial parsing to include Priority e.g. <10> Modified the regex to account for this and we should be good now. Was able to build successfully. Please let me know if there is anything else I can do to get this committed. > Preserve priority, timestamp and hostname fields in MultiportSyslogTcp and > Udp sources > -- > > Key: FLUME-2217 > URL: https://issues.apache.org/jira/browse/FLUME-2217 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.5.0 >Reporter: Jeff Lord >Assignee: Jeff Lord > Attachments: FLUME-2217.1.patch, FLUME-2217.2.patch, > FLUME-2217.3.patch > > > Flume-1666 added the ability to preserve timestamp and hostname fields of a > syslog message. We should also add this property to the MultiportSyslogTcp > Source and the SyslogUdp sources. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (FLUME-2217) Preserve priority, timestamp and hostname fields in MultiportSyslogTcp and Udp sources
[ https://issues.apache.org/jira/browse/FLUME-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2217: - Attachment: FLUME-2217.2.patch Rev 2 based on feedback from [~mpercy] on reviewboard. > Preserve priority, timestamp and hostname fields in MultiportSyslogTcp and > Udp sources > -- > > Key: FLUME-2217 > URL: https://issues.apache.org/jira/browse/FLUME-2217 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.5.0 >Reporter: Jeff Lord >Assignee: Jeff Lord > Attachments: FLUME-2217.1.patch, FLUME-2217.2.patch > > > Flume-1666 added the ability to preserve timestamp and hostname fields of a > syslog message. We should also add this property to the MultiportSyslogTcp > Source and the SyslogUdp sources. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (FLUME-2056) Allow SpoolDir to pass just the filename that is the source of an event
[ https://issues.apache.org/jira/browse/FLUME-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-2056: Assignee: Jeff Lord > Allow SpoolDir to pass just the filename that is the source of an event > --- > > Key: FLUME-2056 > URL: https://issues.apache.org/jira/browse/FLUME-2056 > Project: Flume > Issue Type: New Feature > Reporter: Jeff Lord >Assignee: Jeff Lord >Priority: Minor > > Currently we allow for passing of the absolute path. > It would be nice to just pass the filename in the event headers and allow for > using that on the hdfs sink. > if (annotateFileName) { > String filename = currentFile.get().getFile().getAbsolutePath(); > for (Event event : events) { > event.getHeaders().put(fileNameHeader, filename); > } > } -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (FLUME-2217) Preserve priority, timestamp and hostname fields in MultiportSyslogTcp and Udp sources
[ https://issues.apache.org/jira/browse/FLUME-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2217: - Summary: Preserve priority, timestamp and hostname fields in MultiportSyslogTcp and Udp sources (was: Preserve timestamp and hostname fields in MultiportSyslogTcp and Udp sources) > Preserve priority, timestamp and hostname fields in MultiportSyslogTcp and > Udp sources > -- > > Key: FLUME-2217 > URL: https://issues.apache.org/jira/browse/FLUME-2217 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.5.0 >Reporter: Jeff Lord >Assignee: Jeff Lord > Attachments: FLUME-2217.1.patch > > > Flume-1666 added the ability to preserve timestamp and hostname fields of a > syslog message. We should also add this property to the MultiportSyslogTcp > Source and the SyslogUdp sources. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (FLUME-2217) Preserve timestamp and hostname fields in MultiportSyslogTcp and Udp sources
[ https://issues.apache.org/jira/browse/FLUME-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2217: - Attachment: FLUME-2217.1.patch Here is a first pass at this. Please note that the functionality of all 3 sources was modified slightly such that with this patch we will now preserve the syslog priority as well as the timestamp and hostname. e.g. <10>2013-10-31T17:36:27.381-07:00 localhost.localdomain test UDP syslog data > Preserve timestamp and hostname fields in MultiportSyslogTcp and Udp sources > > > Key: FLUME-2217 > URL: https://issues.apache.org/jira/browse/FLUME-2217 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.5.0 >Reporter: Jeff Lord >Assignee: Jeff Lord > Attachments: FLUME-2217.1.patch > > > Flume-1666 added the ability to preserve timestamp and hostname fields of a > syslog message. We should also add this property to the MultiportSyslogTcp > Source and the SyslogUdp sources. -- This message was sent by Atlassian JIRA (v6.1#6144)
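[Editor's note] If the committed property follows the naming introduced by FLUME-1666 for the TCP syslog source (an assumption; the patch itself would confirm), enabling preservation on the multiport source would look like:

```properties
# Hypothetical agent/source names and ports.
a1.sources.r1.type = multiport_syslogtcp
a1.sources.r1.ports = 10001 10002
a1.sources.r1.host = 0.0.0.0
a1.sources.r1.channels = c1
# Keep the raw <priority>, timestamp and hostname in the event body
# instead of stripping them during parsing.
a1.sources.r1.keepFields = true
```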
[jira] [Commented] (FLUME-2128) HDFS Sink rollSize is calculated based off of uncompressed size of cumulative events.
[ https://issues.apache.org/jira/browse/FLUME-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803483#comment-13803483 ] Jeff Lord commented on FLUME-2128: -- How are things going with the rebase, Ted? > HDFS Sink rollSize is calculated based off of uncompressed size of cumulative > events. > - > > Key: FLUME-2128 > URL: https://issues.apache.org/jira/browse/FLUME-2128 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.4.0, v1.3.1 >Reporter: Jeff Lord >Assignee: Ted Malaska > Labels: features > Attachments: FLUME-2128-0.patch, FLUME-2128-1.patch > > > The hdfs sink rollSize parameter is compared against uncompressed event sizes. > The net of this is that if you are using compression and expect the size of > your files on HDFS to be rolled/sized based on the value set for rollSize > than your files will be much smaller due to compression. > We should take into account when compression is set and roll based on the > compressed size on hdfs. -- This message was sent by Atlassian JIRA (v6.1#6144)
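[Editor's note] To make the reported behavior concrete: with a config like the sketch below (hypothetical names, illustrative 128 MB target), files land on HDFS well under the target because rollSize is compared against pre-compression byte counts:

```properties
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.fileType = CompressedStream
a1.sinks.k1.hdfs.codeC = snappy
# Compared against the uncompressed size of accumulated events, so the
# compressed on-disk files come out much smaller than this value.
a1.sinks.k1.hdfs.rollSize = 134217728
```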
[jira] [Created] (FLUME-2217) Preserve timestamp and hostname fields in MultiportSyslogTcp and Udp sources
Jeff Lord created FLUME-2217: Summary: Preserve timestamp and hostname fields in MultiportSyslogTcp and Udp sources Key: FLUME-2217 URL: https://issues.apache.org/jira/browse/FLUME-2217 Project: Flume Issue Type: Improvement Components: Sinks+Sources Affects Versions: v1.5.0 Reporter: Jeff Lord Assignee: Jeff Lord Flume-1666 added the ability to preserve timestamp and hostname fields of a syslog message. We should also add this property to the MultiportSyslogTcp Source and the SyslogUdp sources. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (FLUME-1960) Flume hdfs sink missing serializer documentation for HEADER_AND_TEXT serializer
[ https://issues.apache.org/jira/browse/FLUME-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797602#comment-13797602 ] Jeff Lord commented on FLUME-1960: -- [~roshan_naik] Thank you for the clarification. Title changed. > Flume hdfs sink missing serializer documentation for HEADER_AND_TEXT > serializer > --- > > Key: FLUME-1960 > URL: https://issues.apache.org/jira/browse/FLUME-1960 > Project: Flume > Issue Type: Documentation > Components: Docs >Affects Versions: v1.3.1 >Reporter: Rob Johnson >Assignee: Jeff Lord >Priority: Minor > Labels: noob > > A special serializer for HDFS Sink was added to Flume a while back, but it's > not documented. This serializer is useful when the source is any type of > syslog source. > Without specifying the serializer, the timestamp and host are not logged to > the file with the event information, which is pretty useless without the > timestamp and hosts. > The serializer can be configured on an hdfs sink like so: > agent1.sinks.k1.serializer=HEADER_AND_TEXT > Without this serializer specified you get (for example): > adclient[12112]: INFO daemon.main Start trusted domain > discovery > as an event. > When you specify the serializer, the same event looks like this: > {timestamp=1364380838000, Severity=6, host=myhostname, Facility=4} > adclient[12112]: INFO daemon.main Start trusted domain > discovery > Which is much more useful. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (FLUME-1960) Flume hdfs sink missing serializer documentation for HEADER_AND_TEXT serializer
[ https://issues.apache.org/jira/browse/FLUME-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-1960: Assignee: Jeff Lord > Flume hdfs sink missing serializer documentation for HEADER_AND_TEXT > serializer > --- > > Key: FLUME-1960 > URL: https://issues.apache.org/jira/browse/FLUME-1960 > Project: Flume > Issue Type: Documentation > Components: Docs >Affects Versions: v1.3.1 >Reporter: Rob Johnson >Assignee: Jeff Lord >Priority: Minor > Labels: noob > > A special serializer for HDFS Sink was added to Flume a while back, but it's > not documented. This serializer is useful when the source is any type of > syslog source. > Without specifying the serializer, the timestamp and host are not logged to > the file with the event information, which is pretty useless without the > timestamp and hosts. > The serializer can be configured on an hdfs sink like so: > agent1.sinks.k1.serializer=HEADER_AND_TEXT > Without this serializer specified you get (for example): > adclient[12112]: INFO daemon.main Start trusted domain > discovery > as an event. > When you specify the serializer, the same event looks like this: > {timestamp=1364380838000, Severity=6, host=myhostname, Facility=4} > adclient[12112]: INFO daemon.main Start trusted domain > discovery > Which is much more useful. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (FLUME-1960) Flume hdfs sink missing serializer documentation for HEADER_AND_TEXT serializer
[ https://issues.apache.org/jira/browse/FLUME-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1960: - Summary: Flume hdfs sink missing serializer documentation for HEADER_AND_TEXT serializer (was: Flume syslog sink missing serializer documentation) > Flume hdfs sink missing serializer documentation for HEADER_AND_TEXT > serializer > --- > > Key: FLUME-1960 > URL: https://issues.apache.org/jira/browse/FLUME-1960 > Project: Flume > Issue Type: Documentation > Components: Docs >Affects Versions: v1.3.1 >Reporter: Rob Johnson >Priority: Minor > Labels: noob > > A special serializer for HDFS Sink was added to Flume a while back, but it's > not documented. This serializer is useful when the source is any type of > syslog source. > Without specifying the serializer, the timestamp and host are not logged to > the file with the event information, which is pretty useless without the > timestamp and hosts. > The serializer can be configured on an hdfs sink like so: > agent1.sinks.k1.serializer=HEADER_AND_TEXT > Without this serializer specified you get (for example): > adclient[12112]: INFO daemon.main Start trusted domain > discovery > as an event. > When you specify the serializer, the same event looks like this: > {timestamp=1364380838000, Severity=6, host=myhostname, Facility=4} > adclient[12112]: INFO daemon.main Start trusted domain > discovery > Which is much more useful. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (FLUME-1960) Flume syslog sink missing serializer documentation
[ https://issues.apache.org/jira/browse/FLUME-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795333#comment-13795333 ] Jeff Lord commented on FLUME-1960: -- Is this still an issue now that FLUME-1666 has been resolved? I don't think it is. If it's not, let's go ahead and close it out. > Flume syslog sink missing serializer documentation > -- > > Key: FLUME-1960 > URL: https://issues.apache.org/jira/browse/FLUME-1960 > Project: Flume > Issue Type: Documentation > Components: Docs >Affects Versions: v1.3.1 >Reporter: Rob Johnson >Priority: Minor > Labels: noob > > A special serializer for HDFS Sink was added to Flume a while back, but it's > not documented. This serializer is useful when the source is any type of > syslog source. > Without specifying the serializer, the timestamp and host are not logged to > the file with the event information, which is pretty useless without the > timestamp and hosts. > The serializer can be configured on an hdfs sink like so: > agent1.sinks.k1.serializer=HEADER_AND_TEXT > Without this serializer specified you get (for example): > adclient[12112]: INFO daemon.main Start trusted domain > discovery > as an event. > When you specify the serializer, the same event looks like this: > {timestamp=1364380838000, Severity=6, host=myhostname, Facility=4} > adclient[12112]: INFO daemon.main Start trusted domain > discovery > Which is much more useful. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (FLUME-2120) Capture Metrics to Monitor SyslogUDPSource and SyslogTCPSource
[ https://issues.apache.org/jira/browse/FLUME-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791696#comment-13791696 ] Jeff Lord commented on FLUME-2120: -- [~venkyz] Can you please attach the rb link to the jira here? > Capture Metrics to Monitor SyslogUDPSource and SyslogTCPSource > -- > > Key: FLUME-2120 > URL: https://issues.apache.org/jira/browse/FLUME-2120 > Project: Flume > Issue Type: New Feature > Components: Sinks+Sources >Affects Versions: v1.4.0 >Reporter: Venkatesh Sivasubramanian >Assignee: Venkatesh Sivasubramanian > Fix For: v1.4.1 > > Attachments: FLUME-2120.patch > > > Need ability to track the number of events received and accepted for the > SyslogUDPSource and SyslogTCPSource. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (FLUME-1666) Syslog source strips timestamp and hostname from log message body
[ https://issues.apache.org/jira/browse/FLUME-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1666: - Attachment: FLUME-1666-4.patch > Syslog source strips timestamp and hostname from log message body > - > > Key: FLUME-1666 > URL: https://issues.apache.org/jira/browse/FLUME-1666 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.2.0, v1.3.0 > Environment: This occurs with Flume all the way up through 1.3.0. >Reporter: Josh West >Assignee: Jeff Lord > Attachments: FLUME-1666-1.patch, FLUME-1666-2.patch, > FLUME-1666-3.patch, FLUME-1666-4.patch, FLUME-1666-SyslogTextSerializer.patch > > > The syslog source parses incoming syslog messages. In the process, it strips > the timestamp and hostname from each log message, and places them as Event > headers. > Thus, a syslog message that would normally look like so (when written via > rsyslog or syslogd): > {noformat} > Wed Oct 24 09:18:01 UTC 2012 someserver /USR/SBIN/CRON[26981]: (root) CMD > (/usr/local/sbin/somescript) > {noformat} > Appears in flume output as: > {noformat} > /USR/SBIN/CRON[26981]: (root) CMD (/usr/local/sbin/somescript) > {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (FLUME-1666) Syslog source strips timestamp and hostname from log message body
[ https://issues.apache.org/jira/browse/FLUME-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1666: - Attachment: FLUME-1666-3.patch > Syslog source strips timestamp and hostname from log message body > - > > Key: FLUME-1666 > URL: https://issues.apache.org/jira/browse/FLUME-1666 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.2.0, v1.3.0 > Environment: This occurs with Flume all the way up through 1.3.0. >Reporter: Josh West >Assignee: Jeff Lord > Attachments: FLUME-1666-1.patch, FLUME-1666-2.patch, > FLUME-1666-3.patch, FLUME-1666-SyslogTextSerializer.patch > > > The syslog source parses incoming syslog messages. In the process, it strips > the timestamp and hostname from each log message, and places them as Event > headers. > Thus, a syslog message that would normally look like so (when written via > rsyslog or syslogd): > {noformat} > Wed Oct 24 09:18:01 UTC 2012 someserver /USR/SBIN/CRON[26981]: (root) CMD > (/usr/local/sbin/somescript) > {noformat} > Appears in flume output as: > {noformat} > /USR/SBIN/CRON[26981]: (root) CMD (/usr/local/sbin/somescript) > {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (FLUME-2120) Capture Metrics to Monitor SyslogUDPSource and SyslogTCPSource
[ https://issues.apache.org/jira/browse/FLUME-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790988#comment-13790988 ] Jeff Lord commented on FLUME-2120: -- +1. OK, I am on 1.8.4. Reviewed the patch and it looks good to me. One thing missing is the tcpsyslog source test, though. I have added a test for that source on FLUME-1666, so we will just need to add the source counter test there as well. > Capture Metrics to Monitor SyslogUDPSource and SyslogTCPSource > -- > > Key: FLUME-2120 > URL: https://issues.apache.org/jira/browse/FLUME-2120 > Project: Flume > Issue Type: New Feature > Components: Sinks+Sources >Affects Versions: v1.4.0 >Reporter: Venkatesh Sivasubramanian >Assignee: Venkatesh Sivasubramanian > Fix For: v1.4.1 > > Attachments: FLUME-2120.patch > > > Need ability to track the number of events received and accepted for the > SyslogUDPSource and SyslogTCPSource. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (FLUME-2120) Capture Metrics to Monitor SyslogUDPSource and SyslogTCPSource
[ https://issues.apache.org/jira/browse/FLUME-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790601#comment-13790601 ] Jeff Lord commented on FLUME-2120: -- Venkatesh, Did you create your patch with the following command? git diff HEAD > FLUME-2120.x.patch This is what I use, and it formats properly for rb. -J > Capture Metrics to Monitor SyslogUDPSource and SyslogTCPSource > -- > > Key: FLUME-2120 > URL: https://issues.apache.org/jira/browse/FLUME-2120 > Project: Flume > Issue Type: New Feature > Components: Sinks+Sources >Affects Versions: v1.4.0 >Reporter: Venkatesh Sivasubramanian >Assignee: Venkatesh Sivasubramanian > Fix For: v1.4.1 > > Attachments: FLUME-2120.patch > > > Need ability to track the number of events received and accepted for the > SyslogUDPSource and SyslogTCPSource. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (FLUME-2205) Syslog TCP Source does not provide source metrics
[ https://issues.apache.org/jira/browse/FLUME-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord resolved FLUME-2205. -- Resolution: Duplicate > Syslog TCP Source does not provide source metrics > - > > Key: FLUME-2205 > URL: https://issues.apache.org/jira/browse/FLUME-2205 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.5.0 > Reporter: Jeff Lord >Assignee: Jeff Lord > > The syslog TCP source does not implement a SourceCounter object and > subsequently does not output metrics. > {"CHANNEL.memoryChannel":{"EventPutSuccessCount":"0","ChannelFillPercentage":"0.0","Type":"CHANNEL","StopTime":"0","EventPutAttemptCount":"0","ChannelSize":"0","StartTime":"1381195210953","EventTakeSuccessCount":"0","ChannelCapacity":"100","EventTakeAttemptCount":"2"}} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (FLUME-2120) Capture Metrics to Monitor SyslogUDPSource and SyslogTCPSource
[ https://issues.apache.org/jira/browse/FLUME-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13789324#comment-13789324 ] Jeff Lord commented on FLUME-2120: -- Hi Venkatesh, Thank you for the patch and letting me know that FLUME-2205 is a duplicate of this issue. Are you able to create a review board link for this patch? Thank You, Jeff > Capture Metrics to Monitor SyslogUDPSource and SyslogTCPSource > -- > > Key: FLUME-2120 > URL: https://issues.apache.org/jira/browse/FLUME-2120 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.4.0 >Reporter: Venkatesh Sivasubramanian >Assignee: Venkatesh Sivasubramanian > Fix For: v1.4.1 > > Attachments: FLUME-2120.patch > > > Need ability to track the number of events received and accepted for the > SyslogUDPSource and SyslogTCPSource. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (FLUME-2205) Syslog TCP Source does not provide source metrics
[ https://issues.apache.org/jira/browse/FLUME-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2205: - Issue Type: Improvement (was: Bug) > Syslog TCP Source does not provide source metrics > - > > Key: FLUME-2205 > URL: https://issues.apache.org/jira/browse/FLUME-2205 > Project: Flume > Issue Type: Improvement > Components: Sinks+Sources >Affects Versions: v1.5.0 > Reporter: Jeff Lord >Assignee: Jeff Lord > > The syslog TCP source does not implement a SourceCounter object and > subsequently does not output metrics. > {"CHANNEL.memoryChannel":{"EventPutSuccessCount":"0","ChannelFillPercentage":"0.0","Type":"CHANNEL","StopTime":"0","EventPutAttemptCount":"0","ChannelSize":"0","StartTime":"1381195210953","EventTakeSuccessCount":"0","ChannelCapacity":"100","EventTakeAttemptCount":"2"}} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (FLUME-2205) Syslog TCP Source does not provide source metrics
Jeff Lord created FLUME-2205: Summary: Syslog TCP Source does not provide source metrics Key: FLUME-2205 URL: https://issues.apache.org/jira/browse/FLUME-2205 Project: Flume Issue Type: Bug Components: Sinks+Sources Affects Versions: v1.5.0 Reporter: Jeff Lord Assignee: Jeff Lord The syslog TCP source does not implement a SourceCounter object and subsequently does not output metrics. {"CHANNEL.memoryChannel":{"EventPutSuccessCount":"0","ChannelFillPercentage":"0.0","Type":"CHANNEL","StopTime":"0","EventPutAttemptCount":"0","ChannelSize":"0","StartTime":"1381195210953","EventTakeSuccessCount":"0","ChannelCapacity":"100","EventTakeAttemptCount":"2"}} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (FLUME-1666) Syslog source strips timestamp and hostname from log message body
[ https://issues.apache.org/jira/browse/FLUME-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1666: - Attachment: FLUME-1666-2.patch Review Feedback Incorporated. New Patch Attached. Thanks for the review Mike! > Syslog source strips timestamp and hostname from log message body > - > > Key: FLUME-1666 > URL: https://issues.apache.org/jira/browse/FLUME-1666 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.2.0, v1.3.0 > Environment: This occurs with Flume all the way up through 1.3.0. >Reporter: Josh West >Assignee: Jeff Lord > Attachments: FLUME-1666-1.patch, FLUME-1666-2.patch, > FLUME-1666-SyslogTextSerializer.patch > > > The syslog source parses incoming syslog messages. In the process, it strips > the timestamp and hostname from each log message, and places them as Event > headers. > Thus, a syslog message that would normally look like so (when written via > rsyslog or syslogd): > {noformat} > Wed Oct 24 09:18:01 UTC 2012 someserver /USR/SBIN/CRON[26981]: (root) CMD > (/usr/local/sbin/somescript) > {noformat} > Appears in flume output as: > {noformat} > /USR/SBIN/CRON[26981]: (root) CMD (/usr/local/sbin/somescript) > {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (FLUME-1666) Syslog source strips timestamp and hostname from log message body
[ https://issues.apache.org/jira/browse/FLUME-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1666: - Attachment: FLUME-1666-1.patch Attaching a patch which introduces a boolean keepFields which defaults to false. When set to true this will preserve the timestamp and hostname in the body of the event. Additionally I have added a test for SyslogTcpSource > Syslog source strips timestamp and hostname from log message body > - > > Key: FLUME-1666 > URL: https://issues.apache.org/jira/browse/FLUME-1666 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.2.0, v1.3.0 > Environment: This occurs with Flume all the way up through 1.3.0. >Reporter: Josh West >Assignee: Jeff Lord > Attachments: FLUME-1666-1.patch, FLUME-1666-SyslogTextSerializer.patch > > > The syslog source parses incoming syslog messages. In the process, it strips > the timestamp and hostname from each log message, and places them as Event > headers. > Thus, a syslog message that would normally look like so (when written via > rsyslog or syslogd): > {noformat} > Wed Oct 24 09:18:01 UTC 2012 someserver /USR/SBIN/CRON[26981]: (root) CMD > (/usr/local/sbin/somescript) > {noformat} > Appears in flume output as: > {noformat} > /USR/SBIN/CRON[26981]: (root) CMD (/usr/local/sbin/somescript) > {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
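The keepFields behavior described in the patch note would be switched on from the agent config. A hedged sketch, assuming the property key matches the name given in the patch description (component names, host, and port are placeholders):

```properties
a1.sources = r1
a1.sources.r1.type = syslogtcp
a1.sources.r1.host = 0.0.0.0
a1.sources.r1.port = 5140
a1.sources.r1.channels = c1
# Defaults to false; true preserves the timestamp and hostname
# in the event body instead of stripping them into headers only.
a1.sources.r1.keepFields = true
```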
[jira] [Commented] (FLUME-2200) HTTP Source should be able to use "port" parameter if SSL is enabled
[ https://issues.apache.org/jira/browse/FLUME-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783027#comment-13783027 ] Jeff Lord commented on FLUME-2200: -- +1 I didn't see a review board link though. > HTTP Source should be able to use "port" parameter if SSL is enabled > > > Key: FLUME-2200 > URL: https://issues.apache.org/jira/browse/FLUME-2200 > Project: Flume > Issue Type: Bug >Reporter: Hari Shreedharan >Assignee: Hari Shreedharan > Attachments: FLUME-2200.patch, FLUME-2200.patch > > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (FLUME-2182) Spooling Directory Source can't ingest data completely, when a file contain some wide character, such as chinese character.
[ https://issues.apache.org/jira/browse/FLUME-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779993#comment-13779993 ] Jeff Lord commented on FLUME-2182: -- I believe this issue is a duplicate of FLUME-2052 > Spooling Directory Source can't ingest data completely, when a file contain > some wide character, such as chinese character. > --- > > Key: FLUME-2182 > URL: https://issues.apache.org/jira/browse/FLUME-2182 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.4.0 >Reporter: syntony liu >Priority: Critical > Labels: workaround > Attachments: ModifiedLineDeserializer.java > > > The bug is in ResettableFileInputStream.java: int readChar(). > If the last byte of buf is only part of a wide character, readChar() > shouldn't return -1 (ResettableFileInputStream.java:186). It > loses the remaining data in the file. > I fixed it as follows: > public synchronized int readChar() throws IOException { >// if (!buf.hasRemaining()) { >if (buf.limit() - buf.position() < 10) { > refillBuf(); > } > int start = buf.position(); > charBuf.clear(); > boolean isEndOfInput = false; > if (position >= fileSize) { > isEndOfInput = true; > } > CoderResult res = decoder.decode(buf, charBuf, isEndOfInput); > if (res.isMalformed() || res.isUnmappable()) { > res.throwException(); > } > int delta = buf.position() - start; > charBuf.flip(); > if (charBuf.hasRemaining()) { > char c = charBuf.get(); > // don't increment the persisted location if we are in between a > // surrogate pair, otherwise we may never recover if we seek() to this > // location! > incrPosition(delta, !Character.isHighSurrogate(c)); > return c; > // there may be a partial character in the decoder buffer > } else { > incrPosition(delta, false); > return -1; > } > } > It avoids a partial character, but has a new issue: sometimes, some lines of a > log file have a repeated character. > eg. 
>original file: 123456 >sink file: 1233456 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (FLUME-1666) Syslog source strips timestamp and hostname from log message body
[ https://issues.apache.org/jira/browse/FLUME-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-1666: Assignee: Jeff Lord > Syslog source strips timestamp and hostname from log message body > - > > Key: FLUME-1666 > URL: https://issues.apache.org/jira/browse/FLUME-1666 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.2.0, v1.3.0 > Environment: This occurs with Flume all the way up through 1.3.0. >Reporter: Josh West >Assignee: Jeff Lord > Attachments: FLUME-1666-SyslogTextSerializer.patch > > > The syslog source parses incoming syslog messages. In the process, it strips > the timestamp and hostname from each log message, and places them as Event > headers. > Thus, a syslog message that would normally look like so (when written via > rsyslog or syslogd): > {noformat} > Wed Oct 24 09:18:01 UTC 2012 someserver /USR/SBIN/CRON[26981]: (root) CMD > (/usr/local/sbin/somescript) > {noformat} > Appears in flume output as: > {noformat} > /USR/SBIN/CRON[26981]: (root) CMD (/usr/local/sbin/somescript) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (FLUME-2196) Flume should include the metrics based on rollbacks
Jeff Lord created FLUME-2196: Summary: Flume should include the metrics based on rollbacks Key: FLUME-2196 URL: https://issues.apache.org/jira/browse/FLUME-2196 Project: Flume Issue Type: Improvement Affects Versions: v1.4.0 Reporter: Jeff Lord We don't currently record counters for when a batch of events is rolled back. This would be nice to have for the purpose of monitoring flume for potential throughput issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-2138) Setting hdfs.useLocalTimeStamp = true without the timestamp interceptor configured will result in delivery failure
[ https://issues.apache.org/jira/browse/FLUME-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746121#comment-13746121 ] Jeff Lord commented on FLUME-2138: -- Thanks Ted. I wasn't able to reproduce this using trunk yesterday. > Setting hdfs.useLocalTimeStamp = true without the timestamp interceptor > configured will result in delivery failure > -- > > Key: FLUME-2138 > URL: https://issues.apache.org/jira/browse/FLUME-2138 > Project: Flume > Issue Type: Bug >Affects Versions: v1.3.1 >Reporter: Jeff Lord >Assignee: Jeff Lord > > hdfs.useLocalTimeStamp = true > in my config file, but the following error occurs in hdfs sink: > 013-07-29 11:09:56,923 (lifecycleSupervisor-1-0) [INFO - > org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:110)] > Monitoried counter group for type: SINK, name: hdfsSink, registered > successfully. > 2013-07-29 11:09:56,923 (lifecycleSupervisor-1-0) [INFO - > org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:94)] > Component type: SINK, name: hdfsSink started > 2013-07-29 11:09:56,925 (SinkRunner-PollingRunner-DefaultSinkProcessor) > [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] > Polling sink runner starting > 2013-07-29 11:09:56,929 (SinkRunner-PollingRunner-DefaultSinkProcessor) > [ERROR - > org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:422)] > process failed > java.lang.NullPointerException: Expected timestamp in the Flume event > headers, but it was null > at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204) > at > org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:200) > > at > org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:396) > > at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:356) > at > 
org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68) > > at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) > at java.lang.Thread.run(Thread.java:662) > 2013-07-29 11:09:56,931 (SinkRunner-PollingRunner-DefaultSinkProcessor) > [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] > Unable to deliver event. Exception follows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (FLUME-2138) Setting hdfs.useLocalTimeStamp = true without the timestamp interceptor configured will result in delivery failure
[ https://issues.apache.org/jira/browse/FLUME-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-2138: Assignee: Jeff Lord > Setting hdfs.useLocalTimeStamp = true without the timestamp interceptor > configured will result in delivery failure > -- > > Key: FLUME-2138 > URL: https://issues.apache.org/jira/browse/FLUME-2138 > Project: Flume > Issue Type: Bug >Affects Versions: v1.3.1 > Reporter: Jeff Lord >Assignee: Jeff Lord > > hdfs.useLocalTimeStamp = true > in my config file, but the following error occurs in hdfs sink: > 013-07-29 11:09:56,923 (lifecycleSupervisor-1-0) [INFO - > org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:110)] > Monitoried counter group for type: SINK, name: hdfsSink, registered > successfully. > 2013-07-29 11:09:56,923 (lifecycleSupervisor-1-0) [INFO - > org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:94)] > Component type: SINK, name: hdfsSink started > 2013-07-29 11:09:56,925 (SinkRunner-PollingRunner-DefaultSinkProcessor) > [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] > Polling sink runner starting > 2013-07-29 11:09:56,929 (SinkRunner-PollingRunner-DefaultSinkProcessor) > [ERROR - > org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:422)] > process failed > java.lang.NullPointerException: Expected timestamp in the Flume event > headers, but it was null > at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204) > at > org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:200) > > at > org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:396) > > at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:356) > at > org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68) > > at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) > at 
java.lang.Thread.run(Thread.java:662) > 2013-07-29 11:09:56,931 (SinkRunner-PollingRunner-DefaultSinkProcessor) > [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] > Unable to deliver event. Exception follows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (FLUME-2138) Setting hdfs.useLocalTimeStamp = true without the timestamp interceptor configured will result in delivery failure
Jeff Lord created FLUME-2138: Summary: Setting hdfs.useLocalTimeStamp = true without the timestamp interceptor configured will result in delivery failure Key: FLUME-2138 URL: https://issues.apache.org/jira/browse/FLUME-2138 Project: Flume Issue Type: Bug Affects Versions: v1.3.1 Reporter: Jeff Lord hdfs.useLocalTimeStamp = true in my config file, but the following error occurs in hdfs sink: 013-07-29 11:09:56,923 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:110)] Monitoried counter group for type: SINK, name: hdfsSink, registered successfully. 2013-07-29 11:09:56,923 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:94)] Component type: SINK, name: hdfsSink started 2013-07-29 11:09:56,925 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] Polling sink runner starting 2013-07-29 11:09:56,929 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:422)] process failed java.lang.NullPointerException: Expected timestamp in the Flume event headers, but it was null at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204) at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:200) at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:396) at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:356) at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68) at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) at java.lang.Thread.run(Thread.java:662) 2013-07-29 11:09:56,931 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] Unable to deliver event. Exception follows. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
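The NullPointerException above comes from BucketPath expanding date escapes (such as %Y) in hdfs.path when the event carries no timestamp header. A hedged workaround sketch, assuming the standard timestamp interceptor is acceptable (component names and the path are placeholders):

```properties
# Stamp each event at the source so BucketPath can expand %Y-%m-%d.
a1.sources.r1.interceptors = ts
a1.sources.r1.interceptors.ts.type = timestamp

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events/%Y-%m-%d
```

With the interceptor in place, hdfs.useLocalTimeStamp is not needed; the two are alternative ways of supplying the timestamp the sink's path escapes require.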
[jira] [Commented] (FLUME-2006) in Avro the batch size is called batch-size, in all other sources batchSize
[ https://issues.apache.org/jira/browse/FLUME-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711526#comment-13711526 ] Jeff Lord commented on FLUME-2006: -- Thank you for reviving this one, Ashish. This is a little more than a trivial change, given that we will want to maintain backwards compatibility for folks who have developed against the API using batch-size. 1. To that end, we would probably want to allow either property name to be used and log a warning that batch-size is deprecated and batchSize is the correct name moving forward. 2. More importantly, though, we need to address what happens when someone has both batchSize and batch-size specified in their config. I would propose that we throw an error on startup which states that batch-size is deprecated; please use batchSize. Thoughts? > in Avro the batch size is called batch-size, in all other sources batchSize > --- > > Key: FLUME-2006 > URL: https://issues.apache.org/jira/browse/FLUME-2006 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.3.1 >Reporter: Alexander Alten-Lorenz >Assignee: Jeff Lord >Priority: Trivial > > http://mail-archives.apache.org/mod_mbox/flume-user/201304.mbox/%3c746ae600-7783-40f4-9817-25617370c...@gmail.com%3e -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
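The two proposals in the comment amount to a small precedence rule. A minimal, self-contained sketch of that rule (BatchSizeCompat and resolveBatchSize are hypothetical names for illustration, not Flume API):

```java
import java.util.Map;

public class BatchSizeCompat {

    /** Prefer "batchSize", accept deprecated "batch-size" with a warning,
     *  and fail fast when both are configured (proposal 2 from the comment). */
    public static int resolveBatchSize(Map<String, String> params, int defaultSize) {
        String preferred = params.get("batchSize");
        String deprecated = params.get("batch-size");
        if (preferred != null && deprecated != null) {
            throw new IllegalArgumentException(
                "batch-size is deprecated; configure batchSize only");
        }
        if (deprecated != null) {
            // Proposal 1: honor the old name but warn about it.
            System.err.println("WARN: batch-size is deprecated, use batchSize");
            return Integer.parseInt(deprecated);
        }
        return preferred != null ? Integer.parseInt(preferred) : defaultSize;
    }

    public static void main(String[] args) {
        System.out.println(resolveBatchSize(Map.of("batch-size", "50"), 100));  // 50
        System.out.println(resolveBatchSize(Map.of("batchSize", "200"), 100));  // 200
    }
}
```

In Flume itself this logic would live in the component's configure(Context) method, with the warning going through the component's logger rather than stderr.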
[jira] [Created] (FLUME-2128) HDFS Sink rollSize is calculated based off of uncompressed size of cumulative events.
Jeff Lord created FLUME-2128: Summary: HDFS Sink rollSize is calculated based off of uncompressed size of cumulative events. Key: FLUME-2128 URL: https://issues.apache.org/jira/browse/FLUME-2128 Project: Flume Issue Type: Bug Components: Sinks+Sources Affects Versions: v1.3.1, v1.4.0 Reporter: Jeff Lord The hdfs sink rollSize parameter is compared against uncompressed event sizes. The net of this is that if you are using compression and expect the size of your files on HDFS to be rolled/sized based on the value set for rollSize than your files will be much smaller due to compression. We should take into account when compression is set and roll based on the compressed size on hdfs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
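Put concretely, the mismatch looks like this in a config. A hedged sketch (component names and the codec choice are placeholders):

```properties
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.fileType = CompressedStream
a1.sinks.k1.hdfs.codeC = gzip
# 128 MB, but counted against *uncompressed* event bytes, so the
# files that actually land on HDFS will be considerably smaller.
a1.sinks.k1.hdfs.rollSize = 134217728
```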
[jira] [Created] (FLUME-2125) Flume User Guide/Docs should document which version of Elastic Search libs were used to build the elastic search sink and associated serializers
Jeff Lord created FLUME-2125: Summary: Flume User Guide/Docs should document which version of Elastic Search libs were used to build the elastic search sink and associated serializers Key: FLUME-2125 URL: https://issues.apache.org/jira/browse/FLUME-2125 Project: Flume Issue Type: Improvement Affects Versions: v1.3.1 Reporter: Jeff Lord -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-1977) JMS Source connectionFactory property is not documented
[ https://issues.apache.org/jira/browse/FLUME-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1977: - Attachment: FLUME-1977-0.patch Patch available > JMS Source connectionFactory property is not documented > --- > > Key: FLUME-1977 > URL: https://issues.apache.org/jira/browse/FLUME-1977 > Project: Flume > Issue Type: Improvement > Components: Docs >Reporter: Brock Noland > Assignee: Jeff Lord >Priority: Minor > Attachments: FLUME-1977-0.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (FLUME-1977) JMS Source connectionFactory property is not documented
[ https://issues.apache.org/jira/browse/FLUME-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-1977: Assignee: Jeff Lord > JMS Source connectionFactory property is not documented > --- > > Key: FLUME-1977 > URL: https://issues.apache.org/jira/browse/FLUME-1977 > Project: Flume > Issue Type: Improvement > Components: Docs >Reporter: Brock Noland > Assignee: Jeff Lord >Priority: Minor > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-1741) ElasticSearch tests leave directory data/elasticsearch/nodes/ lying around
[ https://issues.apache.org/jira/browse/FLUME-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673791#comment-13673791 ] Jeff Lord commented on FLUME-1741: -- Mike and Hari I don't think this one is an issue any longer. On latest trunk the data directory is not left behind. Can you please let me know if you observe different behavior? > ElasticSearch tests leave directory data/elasticsearch/nodes/ lying around > -- > > Key: FLUME-1741 > URL: https://issues.apache.org/jira/browse/FLUME-1741 > Project: Flume > Issue Type: Improvement >Reporter: Brock Noland >Assignee: Jeff Lord >Priority: Minor > Attachments: FLUME-1741-1.patch, FLUME-1741-2.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (FLUME-2056) Allow SpoolDir to pass just the filename that is the source of an event
Jeff Lord created FLUME-2056: Summary: Allow SpoolDir to pass just the filename that is the source of an event Key: FLUME-2056 URL: https://issues.apache.org/jira/browse/FLUME-2056 Project: Flume Issue Type: New Feature Reporter: Jeff Lord Currently we allow for passing of the absolute path. It would be nice to just pass the filename in the event headers and allow for using that on the hdfs sink. if (annotateFileName) { String filename = currentFile.get().getFile().getAbsolutePath(); for (Event event : events) { event.getHeaders().put(fileNameHeader, filename); } } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-2056) Allow SpoolDir to pass just the filename that is the source of an event
[ https://issues.apache.org/jira/browse/FLUME-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2056: - Priority: Minor (was: Major) > Allow SpoolDir to pass just the filename that is the source of an event > --- > > Key: FLUME-2056 > URL: https://issues.apache.org/jira/browse/FLUME-2056 > Project: Flume > Issue Type: New Feature > Reporter: Jeff Lord >Priority: Minor > > Currently we allow for passing of the absolute path. > It would be nice to just pass the filename in the event headers and allow for > using that on the hdfs sink. > if (annotateFileName) { > String filename = currentFile.get().getFile().getAbsolutePath(); > for (Event event : events) { > event.getHeaders().put(fileNameHeader, filename); > } > } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
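The change requested in FLUME-2056 boils down to using the file's base name instead of its absolute path when populating the header. A minimal standalone illustration with plain `java.io.File` (the `basename` helper is hypothetical, not Flume code):

```java
import java.io.File;

// Sketch of the FLUME-2056 request: put just the file name in the event header
// rather than the absolute path that SpoolDir currently uses.
public class FileNameHeader {
    // Extract only the last path component, e.g. "events.log" from "/var/spool/flume/events.log".
    public static String basename(String path) {
        return new File(path).getName();
    }

    public static void main(String[] args) {
        File f = new File("/var/spool/flume/events.log");
        System.out.println(f.getAbsolutePath()); // full path, what the header carries today
        System.out.println(basename(f.getPath())); // just the file name, as requested
    }
}
```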
[jira] [Assigned] (FLUME-2006) in Avro the batch size is called batch-size, in all other sources batchSize
[ https://issues.apache.org/jira/browse/FLUME-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-2006: Assignee: Jeff Lord (was: Alexander Alten-Lorenz) > in Avro the batch size is called batch-size, in all other sources batchSize > --- > > Key: FLUME-2006 > URL: https://issues.apache.org/jira/browse/FLUME-2006 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.3.1 >Reporter: Alexander Alten-Lorenz >Assignee: Jeff Lord >Priority: Trivial > > http://mail-archives.apache.org/mod_mbox/flume-user/201304.mbox/%3c746ae600-7783-40f4-9817-25617370c...@gmail.com%3e -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-2004) Need to capture metrics on the Flume exec source such as events received, rejected, etc.
[ https://issues.apache.org/jira/browse/FLUME-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-2004: - Summary: Need to capture metrics on the Flume exec source such as events received, rejected, etc. (was: Need to capture metrics on the Flume source such as events received, rejected, etc.) > Need to capture metrics on the Flume exec source such as events received, > rejected, etc. > > > Key: FLUME-2004 > URL: https://issues.apache.org/jira/browse/FLUME-2004 > Project: Flume > Issue Type: New Feature > Components: Channel, Sinks+Sources >Affects Versions: v1.3.0 >Reporter: Robert Justice >Priority: Minor > > To give you a background, we have configured our flume agents with "exec" > source to tail some logs and stream it to a set of collectors. We have > monitoring setup that polls "/metrics" on these agents and verifies a few > metrics like channel size, event put count etc. The "/metrics" response > returns data about the 'channel' and 'sink' (we have an avro sink) > components, however there is no data returned about the "source" (exec) > component. We are interested in monitoring the number of events received, > rejected, etc. > I couldn't find any documentation around this; I checked the flume source and > saw the metrics are not getting captured in the code as well for 'exec' > source. > Please improve the metrics captured for the 'exec' source for flume. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-997) Support secure transport mechanism
[ https://issues.apache.org/jira/browse/FLUME-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628316#comment-13628316 ] Jeff Lord commented on FLUME-997: - Hey Joey, Were you still planning on posting the negative test here? Thanks, Jeff > Support secure transport mechanism > -- > > Key: FLUME-997 > URL: https://issues.apache.org/jira/browse/FLUME-997 > Project: Flume > Issue Type: New Feature >Affects Versions: v1.0.0 >Reporter: Mike Percy > Labels: security > Fix For: v1.4.0 > > Attachments: FLUME-997-1.patch, FLUME-997-2.patch > > > Flume needs support for a secure network transport protocol. See AVRO-898 for > a patch that made it into Avro 1.6.0 that allows us to do this relatively > easily. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (FLUME-1978) File Channel Encryption Docs are unclear
Jeff Lord created FLUME-1978: Summary: File Channel Encryption Docs are unclear Key: FLUME-1978 URL: https://issues.apache.org/jira/browse/FLUME-1978 Project: Flume Issue Type: Improvement Components: Docs, File Channel Affects Versions: v1.4.0 Reporter: Jeff Lord 1. Please add a link in the menu bar to jump to instructions on file channel encryption. 2. Please clean up and make the doc more clear on how to actually put this in place. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (FLUME-1886) Add a JMS enum type to SourceType so that users don't need to enter FQCN for JMSSource
[ https://issues.apache.org/jira/browse/FLUME-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-1886: Assignee: Jeff Lord > Add a JMS enum type to SourceType so that users don't need to enter FQCN for > JMSSource > -- > > Key: FLUME-1886 > URL: https://issues.apache.org/jira/browse/FLUME-1886 > Project: Flume > Issue Type: Bug > Components: Configuration >Affects Versions: v1.4.0 >Reporter: Will McQueen >Assignee: Jeff Lord >Priority: Minor > Fix For: v1.4.0 > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-1962) Document proper specification of lzo codec as lzop in Flume User Guide
[ https://issues.apache.org/jira/browse/FLUME-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1962: - Attachment: FLUME-1962-2.patch Added back in lzo because that will work as well. > Document proper specification of lzo codec as lzop in Flume User Guide > -- > > Key: FLUME-1962 > URL: https://issues.apache.org/jira/browse/FLUME-1962 > Project: Flume > Issue Type: Documentation > Components: Configuration, Docs >Affects Versions: v1.4.0 > Reporter: Jeff Lord >Assignee: Jeff Lord >Priority: Minor > Attachments: FLUME-1962-1.patch, FLUME-1962-2.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-1962) Document proper specification of lzo codec as lzop in Flume User Guide
[ https://issues.apache.org/jira/browse/FLUME-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1962: - Attachment: FLUME-1962-1.patch -lzo +lzop > Document proper specification of lzo codec as lzop in Flume User Guide > -- > > Key: FLUME-1962 > URL: https://issues.apache.org/jira/browse/FLUME-1962 > Project: Flume > Issue Type: Documentation > Components: Configuration, Docs >Affects Versions: v1.4.0 > Reporter: Jeff Lord >Assignee: Jeff Lord >Priority: Minor > Attachments: FLUME-1962-1.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (FLUME-1962) Document proper specification of lzo codec as lzop in Flume User Guide
Jeff Lord created FLUME-1962: Summary: Document proper specification of lzo codec as lzop in Flume User Guide Key: FLUME-1962 URL: https://issues.apache.org/jira/browse/FLUME-1962 Project: Flume Issue Type: Documentation Components: Configuration, Docs Affects Versions: v1.4.0 Reporter: Jeff Lord Assignee: Jeff Lord Priority: Minor -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (FLUME-1829) Update Flume Wiki and User Guide to provide clearer explanation of BatchSize, ChannelCapacity and ChannelTransactionCapacity properties.
Jeff Lord created FLUME-1829: Summary: Update Flume Wiki and User Guide to provide clearer explanation of BatchSize, ChannelCapacity and ChannelTransactionCapacity properties. Key: FLUME-1829 URL: https://issues.apache.org/jira/browse/FLUME-1829 Project: Flume Issue Type: Improvement Components: Docs Reporter: Jeff Lord Priority: Minor 1) Batch Size 1.a) When configured by client code using the flume-core-sdk to send events to a Flume Avro source: the Flume client SDK has an appendBatch method, which takes a list of events and sends them to the source as a batch. This is the number of events passed to the source at one time. 1.b) When set as a parameter on the HDFS sink (or other sinks which support a BatchSize parameter): this is the number of events written to a file before it is flushed to HDFS. 2.a) Channel Capacity: this is the maximum number of events the channel can hold. 2.b) Channel Transaction Capacity: this is the maximum number of events stored in the channel per transaction. How will setting these parameters to different values affect throughput and latency in the event flow? In general you will see better throughput by using the memory channel as opposed to the file channel, at the cost of durability. The channel capacity needs to be sized large enough to hold as many events as will be added to it by upstream agents. In an ideal flow the sink drains events from the channel faster than its source adds them. The channel transaction capacity will need to be smaller than the channel capacity; e.g. if your Channel Capacity is large, the Channel Transaction Capacity should be set to something much smaller, like 100. Specifically, if we have clients with varying frequency of event generation, i.e. some clients generating thousands of events/sec while others generate at a much slower rate, what effect will different values of these params have on these clients? 
Transaction Capacity is going to be what throttles or limits how many events the source can put into the channel. This is going to vary depending on how many tiers of agents/collectors you have set up. In general, though, this should probably be equal to whatever you have the batch size set to in your client. With regards to the hdfs batch size, the larger your batch size, the better performance will be. However, keep in mind that if a transaction fails the entire transaction will be replayed, which can result in duplicate events downstream. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
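The sizing rules discussed in this thread reduce to an ordering constraint: batch size at most the transaction capacity, which is at most the channel capacity. A small standalone sketch (not Flume code; the class and method are made up for illustration):

```java
// Sketch of the sizing rules described above:
// batchSize <= transactionCapacity <= capacity.
public class ChannelSizing {
    // True when the three settings respect the ordering the thread recommends.
    public static boolean isConsistent(int batchSize, int transactionCapacity, int capacity) {
        return batchSize > 0
            && batchSize <= transactionCapacity
            && transactionCapacity <= capacity;
    }

    public static void main(String[] args) {
        // e.g. a channel holding up to 10000 events, committed 100 at a time
        System.out.println(isConsistent(100, 100, 10000));  // consistent sizing
        System.out.println(isConsistent(500, 100, 10000));  // batch exceeds a transaction
    }
}
```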
[jira] [Commented] (FLUME-1666) Syslog source strips timestamp and hostname from log message body
[ https://issues.apache.org/jira/browse/FLUME-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534522#comment-13534522 ] Jeff Lord commented on FLUME-1666: -- +1 for an optional setting to strip hostname and timestamp or not. > Syslog source strips timestamp and hostname from log message body > - > > Key: FLUME-1666 > URL: https://issues.apache.org/jira/browse/FLUME-1666 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.2.0, v1.3.0 > Environment: This occurs with Flume all the way up through 1.3.0. >Reporter: Josh West > Attachments: FLUME-1666-SyslogTextSerializer.patch > > > The syslog source parses incoming syslog messages. In the process, it strips > the timestamp and hostname from each log message, and places them as Event > headers. > Thus, a syslog message that would normally look like so (when written via > rsyslog or syslogd): > {noformat} > Wed Oct 24 09:18:01 UTC 2012 someserver /USR/SBIN/CRON[26981]: (root) CMD > (/usr/local/sbin/somescript) > {noformat} > Appears in flume output as: > {noformat} > /USR/SBIN/CRON[26981]: (root) CMD (/usr/local/sbin/somescript) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (FLUME-1783) flume-ng script unsets IFS
[ https://issues.apache.org/jira/browse/FLUME-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-1783: Assignee: Jeff Lord > flume-ng script unsets IFS > -- > > Key: FLUME-1783 > URL: https://issues.apache.org/jira/browse/FLUME-1783 > Project: Flume > Issue Type: Bug >Reporter: Brock Noland >Assignee: Jeff Lord >Priority: Minor > > This is generally a bad idea because it assumes IFS was set to the default. > The correct idiom is to save the previous IFS and then reset it to it's > previous value. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-1592) Remove appendTimeout from HDFS sink docs
[ https://issues.apache.org/jira/browse/FLUME-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534208#comment-13534208 ] Jeff Lord commented on FLUME-1592: -- I only see this in one file but not in the current trunk docs ‹trunk› ╰─ grep -r appendTimeout * flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java: private void slowAppendTestHelper (long appendTimeout) throws InterruptedException, IOException, flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java: context.put("hdfs.appendTimeout", String.valueOf(appendTimeout)); Binary file flume-ng-sinks/flume-hdfs-sink/target/test-classes/org/apache/flume/sink/hdfs/TestHDFSEventSink.class matches > Remove appendTimeout from HDFS sink docs > > > Key: FLUME-1592 > URL: https://issues.apache.org/jira/browse/FLUME-1592 > Project: Flume > Issue Type: Bug >Reporter: Mike Percy >Assignee: Jeff Lord >Priority: Trivial > Fix For: v1.4.0 > > > appendTimeout is no longer used -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (FLUME-1592) Remove appendTimeout from HDFS sink docs
[ https://issues.apache.org/jira/browse/FLUME-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord reassigned FLUME-1592: Assignee: Jeff Lord > Remove appendTimeout from HDFS sink docs > > > Key: FLUME-1592 > URL: https://issues.apache.org/jira/browse/FLUME-1592 > Project: Flume > Issue Type: Bug >Reporter: Mike Percy >Assignee: Jeff Lord >Priority: Trivial > Fix For: v1.4.0 > > > appendTimeout is no longer used -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-1741) ElasticSearch tests leave directory data/elasticsearch/nodes/ lying around
[ https://issues.apache.org/jira/browse/FLUME-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532786#comment-13532786 ] Jeff Lord commented on FLUME-1741: -- @Brock Were you suggesting that I add the path setting in here and then just call this method? private void openLocalDiscoveryClient() { node = NodeBuilder.nodeBuilder().client(true).local(true).node(); client = node.client(); } > ElasticSearch tests leave directory data/elasticsearch/nodes/ lying around > -- > > Key: FLUME-1741 > URL: https://issues.apache.org/jira/browse/FLUME-1741 > Project: Flume > Issue Type: Improvement >Reporter: Brock Noland >Assignee: Jeff Lord >Priority: Minor > Attachments: FLUME-1741-1.patch, FLUME-1741-2.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-1741) ElasticSearch tests leave directory data/elasticsearch/nodes/ lying around
[ https://issues.apache.org/jira/browse/FLUME-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1741: - Attachment: FLUME-1741-2.patch Ok. Thank you. Here is another way to do it. Write to target and that will be cleaned up by mvn clean. > ElasticSearch tests leave directory data/elasticsearch/nodes/ lying around > -- > > Key: FLUME-1741 > URL: https://issues.apache.org/jira/browse/FLUME-1741 > Project: Flume > Issue Type: Improvement >Reporter: Brock Noland >Assignee: Jeff Lord >Priority: Minor > Attachments: FLUME-1741-1.patch, FLUME-1741-2.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-1741) ElasticSearch tests leave directory data/elasticsearch/nodes/ lying around
[ https://issues.apache.org/jira/browse/FLUME-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1741: - Attachment: FLUME-1741-1.patch Note that this sets the DEFAULT_TEMP_DIR constant to "data". I wasn't able to find where this object is held in ES. Please let me know if there is a more appropriate way to handle this. > ElasticSearch tests leave directory data/elasticsearch/nodes/ lying around > -- > > Key: FLUME-1741 > URL: https://issues.apache.org/jira/browse/FLUME-1741 > Project: Flume > Issue Type: Improvement >Reporter: Brock Noland >Assignee: Jeff Lord >Priority: Minor > Attachments: FLUME-1741-1.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (FLUME-1742) Mistake in code example in Flume Developer Guide, section "Developing custom components/Source"
[ https://issues.apache.org/jira/browse/FLUME-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord resolved FLUME-1742. -- Resolution: Fixed Resolving as this was fixed in FLUME-1707 > Mistake in code example in Flume Developer Guide, section "Developing custom > components/Source" > --- > > Key: FLUME-1742 > URL: https://issues.apache.org/jira/browse/FLUME-1742 > Project: Flume > Issue Type: Story > Components: Docs >Reporter: Dmitry Kholodilov >Priority: Minor > Fix For: v1.3.0 > > > It seems that there is a mistake in source code example here: > http://flume.apache.org/FlumeDeveloperGuide.html#source > {code} > // bar source > public class BarSource extends AbstractSource implements Configurable, > EventDrivenSource{ > @Override > public void configure(Context context) { > some_Param = context.get("some_param", String.class); > // process some_param … > } > @Override > public void start() { > // initialize the connection to bar client .. > } > @Override > public void stop () { > // cleanup and disconnect from bar client .. > } > @Override > public Status process() throws EventDeliveryException { > try { > // receive new data > Event e = get_some_data(); > // store the event to underlying channels(s) > getChannelProcessor().processEvent(e) > } catch (ChannelException ex) { > return Status.BACKOFF; > } > return Status.READY; > } > } > {code} > I think this class should implement interface PollableSource, not > EventDrivenSource, because the former has process() method and the latter > doesn't have it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-1742) Mistake in code example in Flume Developer Guide, section "Developing custom components/Source"
[ https://issues.apache.org/jira/browse/FLUME-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1742: - Fix Version/s: v1.3.0 > Mistake in code example in Flume Developer Guide, section "Developing custom > components/Source" > --- > > Key: FLUME-1742 > URL: https://issues.apache.org/jira/browse/FLUME-1742 > Project: Flume > Issue Type: Story > Components: Docs >Reporter: Dmitry Kholodilov >Priority: Minor > Fix For: v1.3.0 > > > It seems that there is a mistake in source code example here: > http://flume.apache.org/FlumeDeveloperGuide.html#source > {code} > // bar source > public class BarSource extends AbstractSource implements Configurable, > EventDrivenSource{ > @Override > public void configure(Context context) { > some_Param = context.get("some_param", String.class); > // process some_param … > } > @Override > public void start() { > // initialize the connection to bar client .. > } > @Override > public void stop () { > // cleanup and disconnect from bar client .. > } > @Override > public Status process() throws EventDeliveryException { > try { > // receive new data > Event e = get_some_data(); > // store the event to underlying channels(s) > getChannelProcessor().processEvent(e) > } catch (ChannelException ex) { > return Status.BACKOFF; > } > return Status.READY; > } > } > {code} > I think this class should implement interface PollableSource, not > EventDrivenSource, because the former has process() method and the latter > doesn't have it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (FLUME-1742) Mistake in code example in Flume Developer Guide, section "Developing custom components/Source"
[ https://issues.apache.org/jira/browse/FLUME-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526673#comment-13526673 ] Jeff Lord commented on FLUME-1742: -- This was fixed in FLUME-1707 > Mistake in code example in Flume Developer Guide, section "Developing custom > components/Source" > --- > > Key: FLUME-1742 > URL: https://issues.apache.org/jira/browse/FLUME-1742 > Project: Flume > Issue Type: Story > Components: Docs >Reporter: Dmitry Kholodilov >Priority: Minor > > It seems that there is a mistake in source code example here: > http://flume.apache.org/FlumeDeveloperGuide.html#source > {code} > // bar source > public class BarSource extends AbstractSource implements Configurable, > EventDrivenSource{ > @Override > public void configure(Context context) { > some_Param = context.get("some_param", String.class); > // process some_param … > } > @Override > public void start() { > // initialize the connection to bar client .. > } > @Override > public void stop () { > // cleanup and disconnect from bar client .. > } > @Override > public Status process() throws EventDeliveryException { > try { > // receive new data > Event e = get_some_data(); > // store the event to underlying channels(s) > getChannelProcessor().processEvent(e) > } catch (ChannelException ex) { > return Status.BACKOFF; > } > return Status.READY; > } > } > {code} > I think this class should implement interface PollableSource, not > EventDrivenSource, because the former has process() method and the latter > doesn't have it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
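The distinction behind FLUME-1742 is that a pollable source exposes a process() method the framework calls in a loop, whereas an event-driven source pushes events on its own and has no process(). The sketch below uses simplified stand-in interfaces, not Flume's real API, purely to illustrate why the doc example (which defines process()) belongs to the pollable kind:

```java
// Standalone sketch of the PollableSource vs. EventDrivenSource distinction.
// These interfaces are simplified stand-ins, not Flume's actual classes.
public class SourceKinds {
    enum Status { READY, BACKOFF }

    interface PollableSource {
        Status process(); // the framework polls this repeatedly
    }

    interface EventDrivenSource {
        void start(); // the source delivers events on its own threads
        void stop();
    }

    // The guide's BarSource defines process(), so it must be a PollableSource.
    static class BarSource implements PollableSource {
        @Override
        public Status process() {
            // receive data, hand it to the channel; return BACKOFF on channel errors
            return Status.READY;
        }
    }

    public static void main(String[] args) {
        PollableSource src = new BarSource();
        System.out.println(src.process());
    }
}
```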
[jira] [Updated] (FLUME-1766) AvroSource throws confusing exception when configured without a port
[ https://issues.apache.org/jira/browse/FLUME-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Lord updated FLUME-1766: - Attachment: FLUME-1766-3.patch Added back in the check for bindAddress > AvroSource throws confusing exception when configured without a port > > > Key: FLUME-1766 > URL: https://issues.apache.org/jira/browse/FLUME-1766 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.3.0 >Reporter: Mike Percy >Assignee: Jeff Lord >Priority: Minor > Labels: newbie > Fix For: v1.4.0 > > Attachments: FLUME-1766-1.patch, FLUME-1766-2.patch, > FLUME-1766-3.patch > > > 2012-12-03 18:25:08,210 (conf-file-poller-0) [DEBUG - > org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:74)] > Creating instance of source src-1, type AVRO > 2012-12-03 18:25:08,235 (conf-file-poller-0) [ERROR - > org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:204)] > Failed to load configuration data. Exception follows. 
> java.lang.NumberFormatException: null > at java.lang.Integer.parseInt(Integer.java:417) > at java.lang.Integer.parseInt(Integer.java:499) > at org.apache.flume.source.AvroSource.configure(AvroSource.java:126) > at org.apache.flume.conf.Configurables.configure(Configurables.java:41) > at > org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadSources(PropertiesFileConfigurationProvider.java:323) > at > org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:222) > at > org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:123) > at > org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38) > at > org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:202) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) > at > java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:680) > ^C2012-12-03 18:25:17,989 (node-shutdownHook) [INFO - > org.apache.flume.node.FlumeNode.stop(FlumeNode.java:67)] Flume node stopping > - agent -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-1766) AvroSource throws confusing exception when configured without a port
[ https://issues.apache.org/jira/browse/FLUME-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Lord updated FLUME-1766:
-----------------------------
    Attachment: FLUME-1766-2.patch

Ok. Feedback has been incorporated. Regarding the optional bind, it looks like this value is also assigned to hostname, and there is a check in LifecycleSupervisor which looks for that.

2012-12-04 22:33:04,356 (lifecycleSupervisor-1-3) [ERROR - org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:238)] Unable to start EventDrivenSourceRunner: { source:Avro source avrosource-1: { bindAddress: null, port: 4141 } } - Exception follows.
java.lang.IllegalArgumentException: hostname can't be null

> AvroSource throws confusing exception when configured without a port
> --------------------------------------------------------------------
>
>                 Key: FLUME-1766
>                 URL: https://issues.apache.org/jira/browse/FLUME-1766
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.3.0
>            Reporter: Mike Percy
>            Assignee: Jeff Lord
>            Priority: Minor
>              Labels: newbie
>             Fix For: v1.4.0
>
>         Attachments: FLUME-1766-1.patch, FLUME-1766-2.patch
>
> 2012-12-03 18:25:08,210 (conf-file-poller-0) [DEBUG - org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:74)] Creating instance of source src-1, type AVRO
> 2012-12-03 18:25:08,235 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:204)] Failed to load configuration data. Exception follows.
> java.lang.NumberFormatException: null
>     at java.lang.Integer.parseInt(Integer.java:417)
>     at org.apache.flume.source.AvroSource.configure(AvroSource.java:126)
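The confusing NumberFormatException above comes from calling Integer.parseInt on a missing "port" value. A minimal sketch of the kind of up-front validation the patch is aiming for, so a missing key fails with a clear message instead (the helper name, the Map stand-in for Flume's Context, and the exception type are assumptions for illustration, not Flume's actual API):

```java
import java.util.HashMap;
import java.util.Map;

public class AvroSourceConfigCheck {
    // Fetch a required integer property, failing with a descriptive message
    // when the key is absent rather than letting parseInt(null) blow up.
    static int requiredInt(Map<String, String> ctx, String key) {
        String raw = ctx.get(key);
        if (raw == null) {
            throw new IllegalArgumentException(
                "Required configuration key '" + key + "' is missing");
        }
        return Integer.parseInt(raw);
    }

    public static void main(String[] args) {
        Map<String, String> ctx = new HashMap<>();
        ctx.put("bind", "0.0.0.0");   // note: no "port" key configured
        try {
            requiredInt(ctx, "port");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The same check applied to "bind" would also cover the `hostname can't be null` failure mentioned in the comment above.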
[jira] [Updated] (FLUME-1766) AvroSource throws confusing exception when configured without a port
[ https://issues.apache.org/jira/browse/FLUME-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Lord updated FLUME-1766:
-----------------------------
    Attachment: FLUME-1766-1.patch

> AvroSource throws confusing exception when configured without a port
> --------------------------------------------------------------------
>
>                 Key: FLUME-1766
>                 URL: https://issues.apache.org/jira/browse/FLUME-1766
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.3.0
>            Reporter: Mike Percy
>            Assignee: Jeff Lord
>            Priority: Minor
>              Labels: newbie
>             Fix For: v1.4.0
>
>         Attachments: FLUME-1766-1.patch
>
> 2012-12-03 18:25:08,235 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:204)] Failed to load configuration data. Exception follows.
> java.lang.NumberFormatException: null
>     at org.apache.flume.source.AvroSource.configure(AvroSource.java:126)
[jira] [Assigned] (FLUME-1766) AvroSource throws confusing exception when configured without a port
[ https://issues.apache.org/jira/browse/FLUME-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Lord reassigned FLUME-1766:
--------------------------------
    Assignee: Jeff Lord

> AvroSource throws confusing exception when configured without a port
> --------------------------------------------------------------------
>
>                 Key: FLUME-1766
>                 URL: https://issues.apache.org/jira/browse/FLUME-1766
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.3.0
>            Reporter: Mike Percy
>            Assignee: Jeff Lord
>            Priority: Minor
>              Labels: newbie
>             Fix For: v1.4.0
>
> 2012-12-03 18:25:08,235 (conf-file-poller-0) [ERROR - org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:204)] Failed to load configuration data. Exception follows.
> java.lang.NumberFormatException: null
>     at org.apache.flume.source.AvroSource.configure(AvroSource.java:126)
[jira] [Assigned] (FLUME-1741) ElasticSearch tests leave directory data/elasticsearch/nodes/ lying around
[ https://issues.apache.org/jira/browse/FLUME-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Lord reassigned FLUME-1741:
--------------------------------
    Assignee: Jeff Lord

> ElasticSearch tests leave directory data/elasticsearch/nodes/ lying around
> --------------------------------------------------------------------------
>
>                 Key: FLUME-1741
>                 URL: https://issues.apache.org/jira/browse/FLUME-1741
>             Project: Flume
>          Issue Type: Improvement
>            Reporter: Brock Noland
>            Assignee: Jeff Lord
>            Priority: Minor
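The usual fix for tests that leave directories behind is a recursive delete in the test teardown. A minimal, self-contained sketch of that cleanup, assuming the leftover layout named in the issue title (the class and method names are illustrative; the actual patch may hook this into the test harness differently):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class TestDataCleanup {
    // Recursively delete a directory tree, as a tearDown would for the
    // leftover data/elasticsearch/nodes/ directory.
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return; // nothing to clean up
        }
        try (Stream<Path> walk = Files.walk(root)) {
            // Deepest paths first so children are removed before parents.
            walk.sorted(Comparator.reverseOrder())
                .forEach(p -> p.toFile().delete());
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate the leftover layout in a temporary location.
        Path base = Files.createTempDirectory("es-test");
        Path nodes = base.resolve("data/elasticsearch/nodes/0");
        Files.createDirectories(nodes);
        Files.writeString(nodes.resolve("node.lock"), "lock");

        deleteRecursively(base.resolve("data"));
        System.out.println("leftover exists: " + Files.exists(base.resolve("data")));
    }
}
```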
[jira] [Created] (FLUME-1682) Improve logging message when encrypted file channel is unable to be initialized due to invalid key/keystore
Jeff Lord created FLUME-1682:
--------------------------------

             Summary: Improve logging message when encrypted file channel is unable to be initialized due to invalid key/keystore
                 Key: FLUME-1682
                 URL: https://issues.apache.org/jira/browse/FLUME-1682
             Project: Flume
          Issue Type: Improvement
    Affects Versions: v1.2.0
            Reporter: Jeff Lord

Currently, if you have data in the file channel, stop Flume for some reason (perhaps an upgrade), and then delete and regenerate the keystore, Flume will throw an error similar to the following when it is restarted. It would be good if we could detect that the reason for this failure to initialize is a change/mismatch in the keystore and report it as such, enabling self-diagnosis and a subsequent fix.

2012-10-23 09:21:32,230 ERROR file.Log: Failed to initialize Log on [channel=fileChannel]
java.io.IOException: Unable to read next Transaction from log file /flume/file-channel/data10/log-10 at offset 31519759
    at org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:456)
    at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:245)
    at org.apache.flume.channel.file.Log.replay(Log.java:356)
    at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:258)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
    at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:78)
    at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:498)
    at com.google.protobuf.GeneratedMessage$Builder.parseUnknownField(GeneratedMessage.java:438)
    at org.apache.flume.channel.file.proto.ProtosFactory$TransactionEventHeader$Builder.mergeFrom(ProtosFactory.java:2880)
    at org.apache.flume.channel.file.proto.ProtosFactory$TransactionEventHeader$Builder.mergeFrom(ProtosFactory.java:2732)
    at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:212)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
    at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
    at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
    at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
    at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
    at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
    at org.apache.flume.channel.file.proto.ProtosFactory$TransactionEventHeader.parseDelimitedFrom(ProtosFactory.java:2689)
    at org.apache.flume.channel.file.TransactionEventRecord.fromByteArray(TransactionEventRecord.java:193)
    at org.apache.flume.channel.file.LogFileV3$SequentialReader.doNext(LogFileV3.java:327)
    at org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:452)
    ... 13 more
2012-10-23 09:21:32,236 ERROR file.FileChannel: Failed to start the file channel [channel=fileChannel]
java.io.IOException: Unable to read next Transaction from log file /app/flume/file-channel/data1/log-1 at offset 31519759
    at org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:456)
    at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:245)
    at org.apache.flume.channel.file.Log.replay(Log.java:356)
    at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:258)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset
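A hedged sketch of the improvement FLUME-1682 asks for: when log replay hits a protobuf parse failure, name the likely cause (a changed or mismatched key/keystore on an encrypted channel) instead of surfacing only the raw wire-type error. The exception stand-in and method names here are illustrative, not Flume's actual classes:

```java
import java.io.IOException;

public class ReplayErrorHint {
    // Stand-in for the parse failure seen in the stack trace above
    // (com.google.protobuf.InvalidProtocolBufferException).
    static class InvalidProtocolBufferException extends IOException {
        InvalidProtocolBufferException(String msg) { super(msg); }
    }

    // Simulates reading the next transaction record during replay; a key
    // mismatch makes the ciphertext decode into garbage that protobuf rejects.
    static void readNextRecord(boolean keyMatches) throws IOException {
        if (!keyMatches) {
            throw new InvalidProtocolBufferException(
                "Protocol message tag had invalid wire type.");
        }
    }

    public static void main(String[] args) {
        try {
            readNextRecord(false);
        } catch (InvalidProtocolBufferException e) {
            // The requested diagnostic: state the probable cause so operators
            // can self-diagnose a regenerated or mismatched keystore.
            System.out.println("Unable to parse transaction record: "
                + e.getMessage()
                + " If the channel is encrypted, verify that the configured"
                + " key and keystore match the ones used to write the log.");
        } catch (IOException e) {
            System.out.println("I/O error during replay: " + e.getMessage());
        }
    }
}
```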