[ https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389531&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389531 ]
ASF GitHub Bot logged work on BEAM-8564:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 19/Feb/20 17:51
            Start Date: 19/Feb/20 17:51
    Worklog Time Spent: 10m
      Work Description: lukecwik commented on pull request #10254: [BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r381439166

 ##########
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##########
 @@ -152,6 +153,54 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
      }
    },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   *
+   * <p>The Beam Java SDK does not pull in the required libraries for LZO compression by default, so
+   * it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write
+   * .lzo_deflate files without {@code airlift/aircompressor} and {@code presto-hadoop-apache2}
+   * loaded will result in {@code NoClassDefFoundError} at runtime.
+   */
+  LZO(".lzo_deflate", ".lzo_deflate") {
+    @Override
+    public ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
+      return Channels.newChannel(
+          LzoCompression.createLzoInputStream(Channels.newInputStream(channel)));
+    }
+
+    @Override
+    public WritableByteChannel writeCompressed(WritableByteChannel channel) throws IOException {
+      return Channels.newChannel(
+          LzoCompression.createLzoOutputStream(Channels.newOutputStream(channel)));
+    }
+  },
+
+  /**
+   * LZOP compression using LZOP Codec. .lzo extension is specified for the files with magic bytes
+   * and headers.
+   *
+   * <p>The Beam Java SDK does not pull in the required libraries for LZOP compression by default,
+   * so it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write .lzo files
+   * without {@code airlift/aircompressor} and {@code presto-hadoop-apache2} loaded will result in

 Review comment:
 ```suggestion
    * without {@code io.airlift:aircompressor} and {@code com.facebook.presto.hadoop:hadoop-apache2} loaded will result in a
 ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 389531)
    Time Spent: 11h 20m  (was: 11h 10m)

> Add LZO compression and decompression support
> ---------------------------------------------
>
>                 Key: BEAM-8564
>                 URL: https://issues.apache.org/jira/browse/BEAM-8564
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core
>            Reporter: Amogh Tiwari
>            Assignee: Amogh Tiwari
>            Priority: Minor
>          Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm which is focused on compression
> and decompression speeds.
> This will enable Apache Beam sdk to compress/decompress files using LZO
> compression algorithm.
> This will include the following functionalities:
> # compress() : for compressing files into an LZO archive
> # decompress() : for decompressing files archived using LZO compression
> Appropriate Input and Output stream will also be added to enable working with
> LZO files.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
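Note on the Javadoc under review: since the SDK leaves these libraries out by default, a pipeline author may prefer to check for them up front instead of hitting {@code NoClassDefFoundError} at read or write time. The following is a minimal sketch of such a check using only the JDK. The class `LzoDependencyCheck`, its method names, and the two codec class names listed in it are illustrative assumptions and are not part of the pull request; verify the class names against the artifact versions you actually depend on.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Beam): reports which optional codec classes are missing. */
public class LzoDependencyCheck {

  // ASSUMPTION: class names believed to ship in io.airlift:aircompressor.
  // They are not taken from the pull request; adjust them to your dependency versions.
  static final String[] ASSUMED_REQUIRED_CLASSES = {
    "io.airlift.compress.lzo.LzoCodec",
    "io.airlift.compress.lzo.LzopCodec",
  };

  /** Returns true if the named class can be located without running its static initializers. */
  static boolean isClassPresent(String className) {
    try {
      Class.forName(className, false, LzoDependencyCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  /** Returns the subset of the given class names that cannot be located, in input order. */
  static List<String> missingClasses(String... classNames) {
    List<String> missing = new ArrayList<>();
    for (String name : classNames) {
      if (!isClassPresent(name)) {
        missing.add(name);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    List<String> missing = missingClasses(ASSUMED_REQUIRED_CLASSES);
    if (missing.isEmpty()) {
      System.out.println("LZO codec classes found on the classpath.");
    } else {
      System.out.println("Missing LZO codec classes: " + missing);
    }
  }
}
```

Running `main` before constructing a pipeline would surface a missing dependency as a readable message. Passing `false` to `Class.forName` avoids triggering static initializers in the probed classes.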