[ 
https://issues.apache.org/jira/browse/FLUME-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163584#comment-15163584
 ] 

ASF GitHub Bot commented on FLUME-2459:
---------------------------------------

GitHub user robin7m opened a pull request:

    https://github.com/apache/flume/pull/39

    FLUME-2459: Spooling Directory Source support for compressed files

    Flume has a fantastic source for spooling files; however, many systems 
store relevant files as compressed files. This change enables the Flume 
Spooling Directory Source to support GZip compressed files just as if they 
were plain text (e.g. read line by line).
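
As a rough illustration of what "read line by line, just as if plain text" can 
mean, here is a minimal Java sketch. It is not the code from this pull request 
and does not use any Flume API: the class GzipLineReader and its methods are 
hypothetical names chosen for the example. It assumes UTF-8 input and detects 
GZip by the two-byte magic number rather than by file extension.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration only; not part of Flume or of this pull request.
final class GzipLineReader {

    // Opens a file as a line reader, transparently decompressing it when the
    // first two bytes are the GZip magic number (0x1f 0x8b).
    static BufferedReader open(Path file) throws IOException {
        InputStream in = new BufferedInputStream(Files.newInputStream(file));
        try {
            in.mark(2);              // remember the start so the peek can be undone
            int b1 = in.read();
            int b2 = in.read();
            in.reset();              // rewind to the first byte
            if (b1 == 0x1f && b2 == 0x8b) {
                in = new GZIPInputStream(in);
            }
            // Assumption: the content is UTF-8 text.
            return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        } catch (IOException e) {
            in.close();              // do not leak the file handle on failure
            throw e;
        }
    }

    // Reads every line of a plain or GZip-compressed file into a list.
    static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = open(file)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Prints the lines of each file named on the command line.
    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            for (String line : readLines(Paths.get(arg))) {
                System.out.println(line);
            }
        }
    }
}
```

The sketch only covers decompression and line splitting. It makes no attempt 
at the position tracking or restart recovery that a real spooling source would 
need, and it handles GZip only, since java.util.zip has no BZip2 support.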


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/flume trunk

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flume/pull/39.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #39
    
----
commit b84d01615a47c8152cfa1119a52a1a1f1b445843
Author: Hari Shreedharan <[email protected]>
Date:   2013-09-28T16:35:03Z

    FLUME-2052. Spooling directory source should be able to replace or ignore 
malformed characters
    
    (Mike Percy via Hari Shreedharan)

commit c4e2129fd12f97303a1b8120a2ecf7da456e1b77
Author: Mike Percy <[email protected]>
Date:   2013-10-04T00:25:57Z

    FLUME-2202. AsyncHBaseSink should coalesce increments to reduce RPC 
roundtrips
    
    (Hari Shreedharan via Mike Percy)

commit 9c59a309764498c013ccd202926d86413da01078
Author: Arvind Prabhakar <[email protected]>
Date:   2013-10-04T01:25:02Z

    FLUME-2191. HDFS Minicluster tests failing after protobuf upgrade.
    
    (Hari Shreedharan via Arvind Prabhakar)

commit 20eed3fdcbee57b84504ec0e1adada46950c4f90
Author: Mike Percy <[email protected]>
Date:   2013-10-08T21:00:31Z

    FLUME-2200. HTTP Source should use "port" param for both SSL & cleartext
    
    (Hari Shreedharan via Mike Percy)

commit 02fc1a8cf436dbc9327e96d21452b826978479f8
Author: Mike Percy <[email protected]>
Date:   2013-10-09T01:12:10Z

    FLUME-2208. Jetty's default SocketSelector leaks File descriptors
    
    (Hari Shreedharan via Mike Percy)

commit 1f95219ea6f87173018bde126a3485575a8ee252
Author: Mike Percy <[email protected]>
Date:   2013-10-10T01:49:31Z

    FLUME-1666. Syslog source strips timestamp and hostname from log message 
body
    
    (Jeff Lord via Mike Percy)

commit c9ddf93701e16a93f3b69355bc56545effdc7230
Author: Hari Shreedharan <[email protected]>
Date:   2013-10-12T04:26:40Z

    FLUME-2212. Upgrade to Morphlines-0.8.0
    
    (Wolfgang Hoschek via Hari Shreedharan)

commit 68fe4d45123473adbef1077c5de20b4dd48d3a1d
Author: Hari Shreedharan <[email protected]>
Date:   2013-10-14T22:56:54Z

    FLUME-2159. Remove TestNettyAvroRpcClient.spinThreadsCrazily.
    
    (Roshan Naik via Hari Shreedharan)

commit c420fad5d03dc8d17dce7fe3e59bf3b742f3d22d
Author: Hari Shreedharan <[email protected]>
Date:   2013-10-16T21:24:13Z

    FLUME-2213. MorphlineInterceptor should share metric registry across 
threads for better (aggregate) reporting
    
    (Wolfgang Hoschek via Hari Shreedharan)

commit 8db5de8f85f79d91818f85a241faec5d8eee9b54
Author: Roshan Naik <[email protected]>
Date:   2013-10-21T19:06:51Z

    FLUME-2064: Typo/Grammar in flume main user doc under Scribe
    
    (Ashish Paliwal via Roshan Naik)

commit 730c822c8fd3c393558ee63b48c82bb5a0763266
Author: Mike Percy <[email protected]>
Date:   2013-10-21T19:04:22Z

    FLUME-1666. Oops, forgot new test in previous commit

commit 603bcf2d0ef0d68357d0d40e34484fbdb96aa3f9
Author: Hari Shreedharan <[email protected]>
Date:   2013-10-25T00:47:53Z

    FLUME-2210. UnresolvedAddressException when using multiple hostNames in 
Elasticsearch sink configuration
    
    (Dib Ghosh via Hari Shreedharan)

commit f017ce5aca00d280ad6ee94e63fe3b44c326c5cf
Author: Hari Shreedharan <[email protected]>
Date:   2013-10-25T05:18:37Z

    FLUME-2192. AbstractSinkProcessor stop incorrectly calls start
    
    (Jeremy Karlson via Hari Shreedharan)

commit 3cc8cec0ec37e6575efdbe3badcc28eceee017c0
Author: Hari Shreedharan <[email protected]>
Date:   2013-10-31T01:30:49Z

    FLUME-1851. Fix grammatical error in Flume User Guide.
    
    (Ashish Paliwal via Hari Shreedharan)

commit 6dfe63cdcebaa5f8091b4789f4df5f679ccb3596
Author: Hari Shreedharan <[email protected]>
Date:   2013-10-31T06:13:09Z

    FLUME-2206. ElasticSearchSink ttl field modification to mimic Elasticsearch 
way of specifying TTL
    
    (Dib Ghosh via Hari Shreedharan)

commit a89897bec4e7d6f3342ed966c61668e8a8139af5
Author: Jarek Jarcec Cecho <[email protected]>
Date:   2013-10-31T19:54:10Z

    FLUME-2229. Backoff period gets reset too often in OrderSelector
    
    (Hari Shreedharan via Jarek Jarcec Cecho)

commit e026545183f577d21850162257152ba38a3f6f9f
Author: Roshan Naik <[email protected]>
Date:   2013-11-07T03:30:42Z

    FLUME-2065. Regex Extractor Interceptor config agent name inconsistent with 
rest of docs
    
    (Ashish Paliwal via Roshan Naik)

commit d3f5123c4d6cdbe4e5cca6e7e141e507bb1103a7
Author: Roshan Naik <[email protected]>
Date:   2013-11-07T19:42:05Z

    FLUME-2233. MemoryChannel lock contention on every put due to 
bytesRemaining Semaphore
    
    (Hari Shreedharan via Roshan Naik)

commit e27ae5fdce48100a85e353f97ed8a150afe5a4aa
Author: Roshan Naik <[email protected]>
Date:   2013-11-07T21:07:24Z

    FLUME-2231. Add details in Flume Ganglia config in User Guide
    
    (Ashish Paliwal via Roshan Naik)

commit 705abaf00fbf8ee69ac88cbccae47c1a33f4b4b2
Author: Jarek Jarcec Cecho <[email protected]>
Date:   2013-11-07T22:53:04Z

    FLUME-2235. idleFuture should be cancelled at the start of append
    
    (Hari Shreedharan via Jarek Jarcec Cecho)

commit c23448fc959844eece5a8ab2dbf091c2c4973a26
Author: Mike Percy <[email protected]>
Date:   2013-12-05T20:58:03Z

    FLUME-2255. Correctly handle ChannelExceptions in SpoolingDirectorySource
    
    (Hari Shreedharan via Mike Percy)

commit 2ea4922025e6db25d8627b522146b8b29c40a62b
Author: Hari Shreedharan <[email protected]>
Date:   2013-12-09T23:12:47Z

    FLUME-2262. Log4j Appender should use timeStamp field not getTimestamp, 
which was not available in older log4j versions.
    
    (Brock Noland via Hari Shreedharan)

commit 753e4137918b5bdf559dd50a21db2a832aa1dce3
Author: Hari Shreedharan <[email protected]>
Date:   2013-12-10T00:35:33Z

    FLUME-2238. Provide option to configure worker threads in NettyAvroRpcClient
    
    (Cameron Gandevia via Hari Shreedharan)

commit 67454a71a3aba308ff0d1b29ad3f184e5c37fee2
Author: Hari Shreedharan <[email protected]>
Date:   2013-12-10T04:02:01Z

    FLUME-2209. AsyncHBaseSink will never recover if the column family does not 
exists for the first start.
    
    (Ashish Paliwal via Hari Shreedharan)

commit 9790ca7587060285efa4ae64591cea17dd3f00cf
Author: Mike Percy <[email protected]>
Date:   2013-12-10T22:38:06Z

    FLUME-2217. Add option to preserve all Syslog headers in syslog sources
    
    (Jeff Lord via Mike Percy)

commit d76118d729d2fe0888b934b0dc743f5f068f63dd
Author: Hari Shreedharan <[email protected]>
Date:   2013-12-12T23:11:00Z

    FLUME-2266. Update Morphline Sink to kite-0.10.0.
    
    (Wolfgang Hoschek via Hari Shreedharan)

commit 6373032a620bdc687b6d03b12726713d08c71a10
Author: Hari Shreedharan <[email protected]>
Date:   2013-12-13T20:35:43Z

    FLUME-2155. Index the Flume Event Queue during replay to improve replay 
time.
    
    (Brock Noland via Hari Shreedharan)

commit 58f3f6fb18e18fbf67fbd1ae0044c337845eba8d
Author: Hari Shreedharan <[email protected]>
Date:   2013-12-13T21:23:29Z

    FLUME-1679. Add dependency on Guava to flume-ng-elasticsearch-sink POM
    
    (Andrew Purtell via Hari Shreedharan)

commit 79dc97bddbf6602c5f375337b3261f33d5555775
Author: Hari Shreedharan <[email protected]>
Date:   2013-12-13T22:14:57Z

    FLUME-2264. Log4j Appender + Avro Reflection on string results in an 
invalid avro schema
    
    (Brock Noland via Hari Shreedharan)

commit 90bb15383c9a6d0b376c3ff5c83adade5092f8c4
Author: Hari Shreedharan <[email protected]>
Date:   2013-12-13T22:47:27Z

    FLUME-2239. Clarify File Channel's dataDirs setting in User Guide
    
    (Roshan Naik via Hari Shreedharan)

----


> Spooling Directory Source support for compressed files
> ------------------------------------------------------
>
>                 Key: FLUME-2459
>                 URL: https://issues.apache.org/jira/browse/FLUME-2459
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>    Affects Versions: v1.5.0.1
>            Reporter: Sverre Bakke
>            Assignee: Johny Rufus
>
> Flume has a fantastic source for spooling files; however, many systems store 
> relevant files as compressed files. The Spooling Directory Source should 
> support GZip and BZip2 compressed files just as if they were plain text (e.g. 
> read line by line).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
