[jira] [Assigned] (FLUME-2192) AbstractSinkProcessor stop incorrectly calls start

2013-10-01 Thread Arvind Prabhakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvind Prabhakar reassigned FLUME-2192:
---

Assignee: Jeremy Karlson

> AbstractSinkProcessor stop incorrectly calls start
> --
>
> Key: FLUME-2192
> URL: https://issues.apache.org/jira/browse/FLUME-2192
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.4.0, v1.3.1
>Reporter: Jeremy Karlson
>Assignee: Jeremy Karlson
> Fix For: v1.4.1, v1.5.0
>
> Attachments: FLUME-2192.patch
>
>
> AbstractSinkProcessor incorrectly calls start when trying to stop.  Patch is 
> attached.
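
For readers without the patch at hand, the kind of mistake described above can be sketched as follows. This is a minimal illustration, not Flume's code: SimpleLifecycle, RecordingSink and SketchSinkProcessor are hypothetical names invented for the example, and the attached patch may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a lifecycle contract; not Flume's actual interface.
interface SimpleLifecycle {
    void start();
    void stop();
}

// Hypothetical sink that only records whether it is running.
class RecordingSink implements SimpleLifecycle {
    boolean running = false;

    @Override public void start() { running = true; }
    @Override public void stop() { running = false; }
}

// Hypothetical processor that forwards lifecycle calls to its sinks.
class SketchSinkProcessor implements SimpleLifecycle {
    private final List<RecordingSink> sinks = new ArrayList<>();

    SketchSinkProcessor(List<RecordingSink> sinks) {
        this.sinks.addAll(sinks);
    }

    @Override public void start() {
        for (RecordingSink s : sinks) {
            s.start();
        }
    }

    @Override public void stop() {
        for (RecordingSink s : sinks) {
            // The class of bug reported here would be calling s.start()
            // at this point, which leaves every sink running after stop().
            s.stop();
        }
    }
}
```

A test that starts the processor, stops it, and then checks that no sink is still running would catch a copy-paste slip of this kind.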



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (FLUME-2199) Flume builds with new version require mvn install before site can be generated

2013-10-01 Thread Arvind Prabhakar (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783567#comment-13783567
 ] 

Arvind Prabhakar commented on FLUME-2199:
-

Thanks for the patch Andrew. Do you mind publishing a review request?

> Flume builds with new version require mvn install before site can be generated
> --
>
> Key: FLUME-2199
> URL: https://issues.apache.org/jira/browse/FLUME-2199
> Project: Flume
>  Issue Type: Bug
>  Components: Build
>Affects Versions: v1.4.0
>Reporter: Andrew Bayer
>Assignee: Andrew Bayer
> Fix For: v1.5.0
>
> Attachments: FLUME-2199.patch
>
>
> At this point, if you change the version for Flume, you need to run a mvn 
> install before you can run with -Psite (or, for that matter, javadoc:javadoc) 
> enabled. This is because the top-level POM in flume.git/pom.xml is both the 
> parent POM and the root of the reactor - since it's the parent, it's got to 
> run before any of the children that inherit from it, but site generation 
> should be running *after* all the children, so that it properly pulls in the 
> reactor's build of each child module, rather than having to pull in one 
> already installed/deployed before the build starts.
> There are a bunch of other reasons to split parent POM and top-level POM, but 
> that's the biggest one right there. 
> Also, the javadoc jar generation is a bit messed up - every module's javadoc 
> jar contains not only its own javadocs but the javadocs for every Flume 
> module it depends on. That, again, may make sense in a site context for the 
> top-level, but not for the individual modules. This results in unnecessary 
> bloat in the javadoc jars, and unnecessary time spent downloading the 
> "*-javadoc-resources.jar" for every dependency each module has, due to how 
> the javadoc plugin works. Then there is the per-module site generation, 
> which I am not a fan of in most cases and don't think is needed here. 
> Tweaking the site plugin not to run anywhere but the top-level and the 
> javadoc plugin to not do the dependency aggregation anywhere but the 
> top-level should make a big difference on build speed.





[jira] [Assigned] (FLUME-2199) Flume builds with new version require mvn install before site can be generated

2013-10-01 Thread Arvind Prabhakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvind Prabhakar reassigned FLUME-2199:
---

Assignee: Andrew Bayer






Review Request 14439: Syslog source strips timestamp and hostname from log message body

2013-10-01 Thread Jeff jlord

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14439/
---

Review request for Flume.


Repository: flume-git


Description
---

Attaching a patch which introduces a boolean keepFields, which defaults to 
false. When set to true, it preserves the timestamp and hostname in the 
body of the event. Additionally, I have added a test for SyslogTcpSource.


Diffs
-

  
  flume-ng-core/src/main/java/org/apache/flume/source/SyslogSourceConfigurationConstants.java 5a73c88
  flume-ng-core/src/main/java/org/apache/flume/source/SyslogTcpSource.java db9e0fd
  flume-ng-core/src/main/java/org/apache/flume/source/SyslogUtils.java c2a29a1
  flume-ng-core/src/test/java/org/apache/flume/source/TestSyslogTcpSource.java PRE-CREATION
  flume-ng-core/src/test/java/org/apache/flume/source/TestSyslogUdpSource.java 2d7a429
  flume-ng-core/src/test/java/org/apache/flume/source/TestSyslogUtils.java 7208464
  flume-ng-doc/sphinx/FlumeDeveloperGuide.rst 2be9c68
  flume-ng-doc/sphinx/FlumeUserGuide.rst dac3ce7

Diff: https://reviews.apache.org/r/14439/diff/


Testing
---


Thanks,

Jeff jlord



[jira] [Updated] (FLUME-1666) Syslog source strips timestamp and hostname from log message body

2013-10-01 Thread Jeff Lord (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLUME-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Lord updated FLUME-1666:
-

Attachment: FLUME-1666-1.patch

Attaching a patch which introduces a boolean keepFields, which defaults to 
false. When set to true, it preserves the timestamp and hostname in the 
body of the event. Additionally, I have added a test for SyslogTcpSource.

> Syslog source strips timestamp and hostname from log message body
> -
>
> Key: FLUME-1666
> URL: https://issues.apache.org/jira/browse/FLUME-1666
> Project: Flume
>  Issue Type: Bug
>  Components: Sinks+Sources
>Affects Versions: v1.2.0, v1.3.0
> Environment: This occurs with Flume all the way up through 1.3.0.
>Reporter: Josh West
>Assignee: Jeff Lord
> Attachments: FLUME-1666-1.patch, FLUME-1666-SyslogTextSerializer.patch
>
>
> The syslog source parses incoming syslog messages.  In the process, it strips 
> the timestamp and hostname from each log message, and places them as Event 
> headers.
> Thus, a syslog message that would normally look like so (when written via 
> rsyslog or syslogd):
> {noformat}
> Wed Oct 24 09:18:01 UTC 2012 someserver /USR/SBIN/CRON[26981]: (root) CMD 
> (/usr/local/sbin/somescript)
> {noformat}
> Appears in flume output as:
> {noformat}
> /USR/SBIN/CRON[26981]: (root) CMD (/usr/local/sbin/somescript)
> {noformat}
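
The effect of a keepFields-style flag on the example line above can be sketched roughly as follows. This is an illustration only, not the patch: ParsedLine, KeepFieldsSketch and the header names "timestamp" and "host" are assumptions made for the example, and the parsing simply splits the on-disk layout quoted in the issue on spaces. Real syslog wire formats differ and need a proper parser.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical result holder; not Flume's Event class.
class ParsedLine {
    final Map<String, String> headers = new HashMap<>();
    String body;
}

class KeepFieldsSketch {
    // Assumes the layout shown in the issue: six space-separated timestamp
    // tokens, then the hostname, then the rest of the message.
    static ParsedLine parse(String line, boolean keepFields) {
        String[] tokens = line.split(" ", 8);
        if (tokens.length < 8) {
            throw new IllegalArgumentException("unexpected line layout");
        }
        ParsedLine parsed = new ParsedLine();
        // The fields always go into headers, whatever the flag says.
        parsed.headers.put("timestamp",
                String.join(" ", Arrays.copyOfRange(tokens, 0, 6)));
        parsed.headers.put("host", tokens[6]);
        // With keepFields the body keeps the full line; otherwise the
        // timestamp and hostname are stripped from it.
        parsed.body = keepFields ? line : tokens[7];
        return parsed;
    }
}
```

With the flag off, the body matches the stripped output reported in the issue; with it on, the body is the original line and the headers are populated either way.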





[jira] [Created] (FLUME-2202) AsyncHBaseSink should coalesce increments to reduce RPC roundtrips

2013-10-01 Thread Hari Shreedharan (JIRA)

Hari Shreedharan created FLUME-2202:
---

 Summary: AsyncHBaseSink should coalesce increments to reduce RPC roundtrips
 Key: FLUME-2202
 URL: https://issues.apache.org/jira/browse/FLUME-2202
 Project: Flume
  Issue Type: Improvement
Reporter: Hari Shreedharan
Assignee: Hari Shreedharan








[jira] [Commented] (FLUME-2155) Improve replay time

2013-10-01 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783235#comment-13783235
 ] 

Hari Shreedharan commented on FLUME-2155:
-

Looks like my last patch was incomplete (I had created a branch locally and 
probably took the diff against that rather than trunk, sorry!). I will add 
more tests and update the patch.

> Improve replay time
> ---
>
> Key: FLUME-2155
> URL: https://issues.apache.org/jira/browse/FLUME-2155
> Project: Flume
>  Issue Type: Bug
>Reporter: Hari Shreedharan
>Assignee: Hari Shreedharan
> Attachments: 10-11, 1-2, 30-31, 
> 70-71, fc-test.patch, FLUME-2155-initial.patch, SmartReplay1.1.pdf, 
> SmartReplay.pdf
>
>
> File Channel has scaled so well that people now run channels holding 
> hundreds of millions of events. It turns out that replay can be crazy slow 
> even between checkpoints at this scale, because the remove() method in 
> FlumeEventQueue moves every pointer that follows the one being removed (1 
> remove causes 99 million+ moves for a channel of 100 million!). There are 
> several ways of improving this. One is to do the moves at the end of replay, 
> sort of like a compaction. Another is to use the fact that all removes 
> happen from the top of the queue: move the first "k" events out to a hash 
> set and remove from there. We can find k using the write ID of the last 
> checkpoint and the current one. 
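
The second idea can be sketched as below. This is only an illustration under stated assumptions, not the proposed patch: ReplaySketch is a hypothetical class, the queue is modelled as a plain list of unique pointers, k is taken as given, and every replayed remove is assumed to hit one of the first k entries.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ReplaySketch {
    // Applies the replayed removes to the first k pointers via a hash set,
    // then rebuilds the queue once instead of shifting it on every remove.
    static List<Long> replay(List<Long> queue, int k, List<Long> removes) {
        int head = Math.min(k, queue.size());
        // LinkedHashSet keeps insertion order, so surviving pointers stay
        // in their original queue order.
        Set<Long> headSet = new LinkedHashSet<>(queue.subList(0, head));
        for (Long pointer : removes) {
            headSet.remove(pointer); // constant time on average, no shifting
        }
        // Single compaction pass: survivors of the head, then the tail.
        List<Long> result = new ArrayList<>(headSet);
        result.addAll(queue.subList(head, queue.size()));
        return result;
    }
}
```

Each remove then costs a hash lookup rather than a shift of everything behind it, and the only linear work left is the one rebuild at the end of replay.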





[jira] [Commented] (FLUME-2200) HTTP Source should be able to use "port" parameter if SSL is enabled

2013-10-01 Thread Jeff Lord (JIRA)

[ 
https://issues.apache.org/jira/browse/FLUME-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783027#comment-13783027
 ] 

Jeff Lord commented on FLUME-2200:
--

+1 I didn't see a review board link though.

> HTTP Source should be able to use "port" parameter if SSL is enabled
> 
>
> Key: FLUME-2200
> URL: https://issues.apache.org/jira/browse/FLUME-2200
> Project: Flume
>  Issue Type: Bug
>Reporter: Hari Shreedharan
>Assignee: Hari Shreedharan
> Attachments: FLUME-2200.patch, FLUME-2200.patch
>
>



