[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392748=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392748
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 18:08
Start Date: 25/Feb/20 18:08
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 392748)
Time Spent: 17h 10m  (was: 17h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 17h 10m
>  Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm which is focused on compression 
> and decompression speeds.
> This will enable Apache Beam sdk to compress/decompress files using LZO 
> compression algorithm. 
> This will include the following functionalities:
>  # compress() : for compressing files into an LZO archive
>  # decompress() : for decompressing files archived using LZO compression
> Appropriate Input and Output stream will also be added to enable working with 
> LZO files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392678=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392678
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 16:51
Start Date: 25/Feb/20 16:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r383992952
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/util/LzoCompression.java
 ##
 @@ -22,58 +22,52 @@
 import java.io.IOException;
 import java.io.InputStream;
 import java.io.OutputStream;
-import org.apache.hadoop.io.compress.CompressionInputStream;
-import org.apache.hadoop.io.compress.CompressionOutputStream;
 
 public class LzoCompression {
 
   /**
-   * Create a {@link CompressionInputStream} that will read from the given {@link InputStream} using
-   * {@link LzoCodec}.
+   * Create a {@link InputStream} that will read from the given {@link InputStream} using {@link
+   * LzoCodec}.
    *
    * @param inputStream the stream to read compressed bytes from
    * @return a stream to read uncompressed bytes from
    * @throws IOException
    */
-  public static CompressionInputStream createLzoInputStream(InputStream inputStream)
-      throws IOException {
+  public static InputStream createLzoInputStream(InputStream inputStream) throws IOException {
     return new LzoCodec().createInputStream(inputStream);
   }
 
   /**
-   * Create a {@link CompressionInputStream} that will read from the given {@link InputStream} using
-   * {@link LzopCodec}.
+   * Create a {@link InputStream} that will read from the given {@link InputStream} using {@link
 
 Review comment:
   ```suggestion
   * Create an {@link InputStream} that will read from the given {@link InputStream} using {@link
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 392678)
Time Spent: 16h 40m  (was: 16.5h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392680=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392680
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 16:51
Start Date: 25/Feb/20 16:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r383992783
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/util/LzoCompression.java
 ##
 @@ -22,58 +22,52 @@
 import java.io.IOException;
 import java.io.InputStream;
 import java.io.OutputStream;
-import org.apache.hadoop.io.compress.CompressionInputStream;
-import org.apache.hadoop.io.compress.CompressionOutputStream;
 
 public class LzoCompression {
 
   /**
-   * Create a {@link CompressionInputStream} that will read from the given {@link InputStream} using
-   * {@link LzoCodec}.
+   * Create a {@link InputStream} that will read from the given {@link InputStream} using {@link
 
 Review comment:
   ```suggestion
   * Create an {@link InputStream} that will read from the given {@link InputStream} using {@link
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 392680)
Time Spent: 17h  (was: 16h 50m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392679=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392679
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 16:51
Start Date: 25/Feb/20 16:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r383993750
 
 

 ##
 File path: sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -83,7 +82,7 @@
 @RunWith(JUnit4.class)
 public class CompressedSourceTest {
 
-  private final double DELTA = 1e-6;
+  private final double delta = 1e-6;
 
 Review comment:
   nit: you should have declared this static and kept the capital letters 
instead of making it a member variable of CompressedSourceTest
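
   A minimal sketch of what this nit asks for, assuming the usual Java
   convention that only static final constants take UPPER_SNAKE_CASE names.
   The class name DeltaConstantSketch and the helper approximatelyEqual are
   hypothetical and are not part of CompressedSourceTest; only the name DELTA
   and the value 1e-6 come from the diff above.

```java
public class DeltaConstantSketch {

  // Shared tolerance. Because it is static and final, it is a true constant
  // and keeps its capitalized name (hypothetical class, illustration only).
  private static final double DELTA = 1e-6;

  /** Returns true when the two values differ by at most DELTA. */
  public static boolean approximatelyEqual(double expected, double actual) {
    return Math.abs(expected - actual) <= DELTA;
  }

  public static void main(String[] args) {
    // A difference of about 1e-9 is inside the tolerance; 0.1 is not.
    System.out.println(approximatelyEqual(0.5, 0.5 + 1e-9));
    System.out.println(approximatelyEqual(0.5, 0.6));
  }
}
```

   With a per-instance field (private final double delta), every test
   instance would carry its own copy of the same value, which is why the
   static form is generally preferred for a fixed tolerance.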
 



Issue Time Tracking
---

Worklog Id: (was: 392679)
Time Spent: 16h 50m  (was: 16h 40m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392677=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392677
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 16:51
Start Date: 25/Feb/20 16:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r383993034
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/util/LzoCompression.java
 ##
 @@ -22,58 +22,52 @@
 import java.io.IOException;
 import java.io.InputStream;
 import java.io.OutputStream;
-import org.apache.hadoop.io.compress.CompressionInputStream;
-import org.apache.hadoop.io.compress.CompressionOutputStream;
 
 public class LzoCompression {
 
   /**
-   * Create a {@link CompressionInputStream} that will read from the given {@link InputStream} using
-   * {@link LzoCodec}.
+   * Create a {@link InputStream} that will read from the given {@link InputStream} using {@link
+   * LzoCodec}.
    *
    * @param inputStream the stream to read compressed bytes from
    * @return a stream to read uncompressed bytes from
    * @throws IOException
    */
-  public static CompressionInputStream createLzoInputStream(InputStream inputStream)
-      throws IOException {
+  public static InputStream createLzoInputStream(InputStream inputStream) throws IOException {
     return new LzoCodec().createInputStream(inputStream);
   }
 
   /**
-   * Create a {@link CompressionInputStream} that will read from the given {@link InputStream} using
-   * {@link LzopCodec}.
+   * Create a {@link InputStream} that will read from the given {@link InputStream} using {@link
+   * LzopCodec}.
    *
    * @param inputStream the stream to read compressed bytes from
    * @return a stream to read uncompressed bytes from
    * @throws IOException
    */
-  public static CompressionInputStream createLzopInputStream(InputStream inputStream)
-      throws IOException {
+  public static InputStream createLzopInputStream(InputStream inputStream) throws IOException {
     return new LzopCodec().createInputStream(inputStream);
   }
 
   /**
-   * Create a {@link CompressionOutputStream} that will write to the given {@link OutputStream}.
+   * Create a {@link OutputStream} that will write to the given {@link OutputStream}.
 
 Review comment:
   ```suggestion
   * Create an {@link OutputStream} that will write to the given {@link OutputStream}.
   ```
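
   For illustration, the pattern in this diff (factory methods that declare
   plain java.io stream types rather than Hadoop's CompressionInputStream and
   CompressionOutputStream) can be sketched as below. This is not the PR's
   code: LzoCodec is not part of the JDK, so java.util.zip's GZIP streams
   stand in for it here, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class StreamFactorySketch {

  // Stand-in for createLzoInputStream: callers only see java.io.InputStream,
  // so the codec-specific stream type does not leak into the public API.
  public static InputStream createInputStream(InputStream in) throws IOException {
    return new GZIPInputStream(in); // assumption: GZIP replaces LZO for this sketch
  }

  // Stand-in for the output-side factory, declared as a plain OutputStream.
  public static OutputStream createOutputStream(OutputStream out) throws IOException {
    return new GZIPOutputStream(out);
  }

  /** Compresses text into memory, decompresses it again, and returns the result. */
  public static String roundTrip(String text) {
    try {
      ByteArrayOutputStream compressed = new ByteArrayOutputStream();
      try (OutputStream out = createOutputStream(compressed)) {
        out.write(text.getBytes(StandardCharsets.UTF_8));
      }
      ByteArrayOutputStream restored = new ByteArrayOutputStream();
      try (InputStream in =
          createInputStream(new ByteArrayInputStream(compressed.toByteArray()))) {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
          restored.write(buffer, 0, n);
        }
      }
      return new String(restored.toByteArray(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("hello lzo"));
  }
}
```

   Because the factories return the general stream types, a caller written
   against them would not need to change if the underlying codec library were
   swapped.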
 



Issue Time Tracking
---

Worklog Id: (was: 392677)
Time Spent: 16.5h  (was: 16h 20m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392676=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392676
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 16:47
Start Date: 25/Feb/20 16:47
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590955514
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 392676)
Time Spent: 16h 20m  (was: 16h 10m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392508=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392508
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 11:51
Start Date: 25/Feb/20 11:51
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590828387
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 392508)
Time Spent: 16h 10m  (was: 16h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392507=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392507
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 11:49
Start Date: 25/Feb/20 11:49
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590828387
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 392507)
Time Spent: 16h  (was: 15h 50m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392503=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392503
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 11:46
Start Date: 25/Feb/20 11:46
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590824053
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 392503)
Time Spent: 15h 50m  (was: 15h 40m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392496=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392496
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 11:36
Start Date: 25/Feb/20 11:36
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590824053
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 392496)
Time Spent: 15h 40m  (was: 15.5h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392372=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392372
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 08:29
Start Date: 25/Feb/20 08:29
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590744759
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 392372)
Time Spent: 15.5h  (was: 15h 20m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392370=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392370
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 08:28
Start Date: 25/Feb/20 08:28
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590744759
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 392370)
Time Spent: 15h 20m  (was: 15h 10m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392369=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392369
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 08:27
Start Date: 25/Feb/20 08:27
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590744228
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 392369)
Time Spent: 15h 10m  (was: 15h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392368
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 25/Feb/20 08:26
Start Date: 25/Feb/20 08:26
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590744228
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 392368)
Time Spent: 15h  (was: 14h 50m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392107=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392107
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 24/Feb/20 21:33
Start Date: 24/Feb/20 21:33
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590559659
 
 
   That sounds great. Should have caught that earlier.
 



Issue Time Tracking
---

Worklog Id: (was: 392107)
Time Spent: 14h 40m  (was: 14.5h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392108=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392108
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 24/Feb/20 21:33
Start Date: 24/Feb/20 21:33
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590559659
 
 
   That sounds great. I should have caught that earlier.
 



Issue Time Tracking
---

Worklog Id: (was: 392108)
Time Spent: 14h 50m  (was: 14h 40m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392102=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392102
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 24/Feb/20 21:28
Start Date: 24/Feb/20 21:28
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590557571
 
 
   @lukecwik We observed that replacing the Compression I/O stream with the 
java.io I/O stream in LzoCompression.java can resolve the issue. Should we go 
ahead and do that?
 



Issue Time Tracking
---

Worklog Id: (was: 392102)
Time Spent: 14.5h  (was: 14h 20m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 14.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=392084=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392084
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 24/Feb/20 20:53
Start Date: 24/Feb/20 20:53
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590543238
 
 
   WordCount doesn't depend on using LZO so it shouldn't be a dependency and 
the pipeline should execute successfully without it. The test may be picking up 
a legitimate case which users would hit as well.
 



Issue Time Tracking
---

Worklog Id: (was: 392084)
Time Spent: 14h 20m  (was: 14h 10m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 14h 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391852=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391852
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 24/Feb/20 17:42
Start Date: 24/Feb/20 17:42
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590459393
 
 
   @lukecwik Do we need to add a test dependency for facebook-presto and 
airlift in /beam/examples/java/build.gradle?
 



Issue Time Tracking
---

Worklog Id: (was: 391852)
Time Spent: 14h  (was: 13h 50m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 14h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391853=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391853
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 24/Feb/20 17:43
Start Date: 24/Feb/20 17:43
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590459393
 
 
   @lukecwik Do we need to add a test dependency for facebook-presto and 
airlift in /beam/examples/java/build.gradle?
 



Issue Time Tracking
---

Worklog Id: (was: 391853)
Time Spent: 14h 10m  (was: 14h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 14h 10m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391826=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391826
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 24/Feb/20 16:43
Start Date: 24/Feb/20 16:43
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590426460
 
 
   Run JavaPortabilityApi PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 391826)
Time Spent: 13h 40m  (was: 13.5h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 13h 40m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391825=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391825
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 24/Feb/20 16:43
Start Date: 24/Feb/20 16:43
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590426399
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 391825)
Time Spent: 13.5h  (was: 13h 20m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 13.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391827=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391827
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 24/Feb/20 16:43
Start Date: 24/Feb/20 16:43
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590426524
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 391827)
Time Spent: 13h 50m  (was: 13h 40m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391369=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391369
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 23/Feb/20 13:43
Start Date: 23/Feb/20 13:43
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590069911
 
 
   retest this please
   
 



Issue Time Tracking
---

Worklog Id: (was: 391369)
Time Spent: 13h 20m  (was: 13h 10m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391368
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 23/Feb/20 13:39
Start Date: 23/Feb/20 13:39
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-590069911
 
 
   retest this please
   
 



Issue Time Tracking
---

Worklog Id: (was: 391368)
Time Spent: 13h 10m  (was: 13h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391194=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391194
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 22/Feb/20 17:27
Start Date: 22/Feb/20 17:27
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-589978924
 
 
   `Run Java_Examples_Dataflow PreCommit`
 



Issue Time Tracking
---

Worklog Id: (was: 391194)
Time Spent: 13h  (was: 12h 50m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 13h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391193=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391193
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 22/Feb/20 17:27
Start Date: 22/Feb/20 17:27
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-589978924
 
 
   `Run Java_Examples_Dataflow PreCommit`
 



Issue Time Tracking
---

Worklog Id: (was: 391193)
Time Spent: 12h 50m  (was: 12h 40m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391171=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391171
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 22/Feb/20 15:55
Start Date: 22/Feb/20 15:55
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-589968838
 
 
   Run JavaPortabilityApi PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 391171)
Time Spent: 12h 40m  (was: 12.5h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=391170=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391170
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 22/Feb/20 15:52
Start Date: 22/Feb/20 15:52
Worklog Time Spent: 10m 
  Work Description: shubham-srivastav commented on issue #10254: 
[BEAM-8564] Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-589968838
 
 
   Run JavaPortabilityApi PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 391170)
Time Spent: 12.5h  (was: 12h 20m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 12.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=390970=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390970
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 21/Feb/20 23:28
Start Date: 21/Feb/20 23:28
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-589879728
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 390970)
Time Spent: 12h 20m  (was: 12h 10m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=390969=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390969
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 21/Feb/20 23:28
Start Date: 21/Feb/20 23:28
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-589879703
 
 
   Run JavaPortabilityApi PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 390969)
Time Spent: 12h 10m  (was: 12h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=390963=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390963
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 21/Feb/20 23:16
Start Date: 21/Feb/20 23:16
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-589876952
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 390963)
Time Spent: 12h  (was: 11h 50m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 12h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=390962=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390962
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 21/Feb/20 23:13
Start Date: 21/Feb/20 23:13
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-589876072
 
 
   retest this please
 


Issue Time Tracking
---

Worklog Id: (was: 390962)
Time Spent: 11h 50m  (was: 11h 40m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389912=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389912
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 20/Feb/20 13:15
Start Date: 20/Feb/20 13:15
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-589016725
 
 
   > After committing the comments, you may need to run spotlessApply again.
   
   Done
 


Issue Time Tracking
---

Worklog Id: (was: 389912)
Time Spent: 11h 40m  (was: 11.5h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389532=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389532
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:52
Start Date: 19/Feb/20 17:52
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-588354735
 
 
   After committing the comments, you may need to run spotlessApply again.
 


Issue Time Tracking
---

Worklog Id: (was: 389532)
Time Spent: 11.5h  (was: 11h 20m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389524=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389524
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:51
Start Date: 19/Feb/20 17:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r381436101
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +153,54 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
     }
   },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
 
 Review comment:
   ```suggestion
  * LZO compression using LZO codec. {@code .lzo_deflate} extension is specified for files which
   ```
 


Issue Time Tracking
---

Worklog Id: (was: 389524)
Time Spent: 11h  (was: 10h 50m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389525=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389525
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:51
Start Date: 19/Feb/20 17:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r381437508
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +153,54 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
     }
   },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   *
+   * The Beam Java SDK does not pull in the required libraries for LZO compression by default, so
+   * it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write
+   * .lzo_deflate files without {@code airlift/aircompressor} and {@code presto-hadoop-apache2}
 
 Review comment:
   ```suggestion
  * {@code .lzo_deflate} files without {@code io.airlift:aircompressor} and {@code com.facebook.presto.hadoop:hadoop-apache2}
   ```
 


Issue Time Tracking
---

Worklog Id: (was: 389525)
Time Spent: 11h  (was: 10h 50m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389527=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389527
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:51
Start Date: 19/Feb/20 17:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r381438625
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +153,54 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
     }
   },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   *
+   * The Beam Java SDK does not pull in the required libraries for LZO compression by default, so
+   * it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write
+   * .lzo_deflate files without {@code airlift/aircompressor} and {@code presto-hadoop-apache2}
+   * loaded will result in {@code NoClassDefFoundError} at runtime.
+   */
+  LZO(".lzo_deflate", ".lzo_deflate") {
+    @Override
+    public ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
+      return Channels.newChannel(
+          LzoCompression.createLzoInputStream(Channels.newInputStream(channel)));
+    }
+
+    @Override
+    public WritableByteChannel writeCompressed(WritableByteChannel channel) throws IOException {
+      return Channels.newChannel(
+          LzoCompression.createLzoOutputStream(Channels.newOutputStream(channel)));
+    }
+  },
+
+  /**
+   * LZOP compression using LZOP Codec. .lzo extension is specified for the files with magic bytes
+   * and headers.
+   *
+   * The Beam Java SDK does not pull in the required libraries for LZOP compression by default,
+   * so it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write .lzo files
 
 Review comment:
   ```suggestion
  * io.airlift:aircompressor} and {@code com.facebook.presto.hadoop:hadoop-apache2}. Attempts to read or write {@code .lzo} files
   ```
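
The `readDecompressed` and `writeCompressed` overrides in the hunk above share one pattern: view the channel as a stream, pass it through a codec stream, and wrap the result back into a channel. A minimal, self-contained sketch of that wrapping follows. It assumes the `java.util.zip` GZIP streams as a stand-in for the `LzoCompression` helpers, whose implementation is not shown in this thread, and `ChannelWrappingSketch` and `roundTrip` are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ChannelWrappingSketch {
  // Stand-in for readDecompressed: channel -> InputStream -> codec stream -> channel.
  // GZIP replaces the LZO codec here purely so the sketch runs on the JDK alone.
  static ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
    return Channels.newChannel(new GZIPInputStream(Channels.newInputStream(channel)));
  }

  // Stand-in for writeCompressed: channel -> OutputStream -> codec stream -> channel.
  static WritableByteChannel writeCompressed(WritableByteChannel channel) throws IOException {
    return Channels.newChannel(new GZIPOutputStream(Channels.newOutputStream(channel)));
  }

  // Compresses text into memory through the wrapped channel, then reads it back.
  static String roundTrip(String text) {
    try {
      ByteArrayOutputStream compressed = new ByteArrayOutputStream();
      try (WritableByteChannel out = writeCompressed(Channels.newChannel(compressed))) {
        ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
          out.write(buf);
        }
      } // closing the channel closes the codec stream, which writes the trailer

      ByteArrayOutputStream plain = new ByteArrayOutputStream();
      try (ReadableByteChannel in =
          readDecompressed(
              Channels.newChannel(new ByteArrayInputStream(compressed.toByteArray())))) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        while (in.read(buf) != -1) {
          buf.flip();
          plain.write(buf.array(), 0, buf.limit());
          buf.clear();
        }
      }
      return new String(plain.toByteArray(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("hello from a wrapped channel"));
  }
}
```

Closing the outer channel is what flushes the codec's trailer, so callers of this kind of wrapper need to close the returned channel rather than only the one they passed in.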
 


Issue Time Tracking
---

Worklog Id: (was: 389527)
Time Spent: 11h 10m  (was: 11h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389531=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389531
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:51
Start Date: 19/Feb/20 17:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r381439166
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +153,54 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
     }
   },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   *
+   * The Beam Java SDK does not pull in the required libraries for LZO compression by default, so
+   * it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write
+   * .lzo_deflate files without {@code airlift/aircompressor} and {@code presto-hadoop-apache2}
+   * loaded will result in {@code NoClassDefFoundError} at runtime.
+   */
+  LZO(".lzo_deflate", ".lzo_deflate") {
+    @Override
+    public ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
+      return Channels.newChannel(
+          LzoCompression.createLzoInputStream(Channels.newInputStream(channel)));
+    }
+
+    @Override
+    public WritableByteChannel writeCompressed(WritableByteChannel channel) throws IOException {
+      return Channels.newChannel(
+          LzoCompression.createLzoOutputStream(Channels.newOutputStream(channel)));
+    }
+  },
+
+  /**
+   * LZOP compression using LZOP Codec. .lzo extension is specified for the files with magic bytes
+   * and headers.
+   *
+   * The Beam Java SDK does not pull in the required libraries for LZOP compression by default,
+   * so it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write .lzo files
+   * without {@code airlift/aircompressor} and {@code presto-hadoop-apache2} loaded will result in
 
 Review comment:
   ```suggestion
  * without {@code io.airlift:aircompressor} and {@code com.facebook.presto.hadoop:hadoop-apache2} loaded will result in a
   ```
 


Issue Time Tracking
---

Worklog Id: (was: 389531)
Time Spent: 11h 20m  (was: 11h 10m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389529=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389529
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:51
Start Date: 19/Feb/20 17:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r381437988
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +153,54 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
     }
   },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   *
+   * The Beam Java SDK does not pull in the required libraries for LZO compression by default, so
+   * it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write
+   * .lzo_deflate files without {@code airlift/aircompressor} and {@code presto-hadoop-apache2}
+   * loaded will result in {@code NoClassDefFoundError} at runtime.
+   */
+  LZO(".lzo_deflate", ".lzo_deflate") {
+    @Override
+    public ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
+      return Channels.newChannel(
+          LzoCompression.createLzoInputStream(Channels.newInputStream(channel)));
+    }
+
+    @Override
+    public WritableByteChannel writeCompressed(WritableByteChannel channel) throws IOException {
+      return Channels.newChannel(
+          LzoCompression.createLzoOutputStream(Channels.newOutputStream(channel)));
+    }
+  },
+
+  /**
+   * LZOP compression using LZOP Codec. .lzo extension is specified for the files with magic bytes
 
 Review comment:
   ```suggestion
  * LZOP compression using LZOP codec. {@code .lzo} extension is specified for files with magic bytes
   ```
 


Issue Time Tracking
---

Worklog Id: (was: 389529)
Time Spent: 11h 10m  (was: 11h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389526=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389526
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:51
Start Date: 19/Feb/20 17:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r381440755
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +153,54 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
     }
   },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   *
+   * The Beam Java SDK does not pull in the required libraries for LZO compression by default, so
+   * it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write
+   * .lzo_deflate files without {@code airlift/aircompressor} and {@code presto-hadoop-apache2}
+   * loaded will result in {@code NoClassDefFoundError} at runtime.
+   */
+  LZO(".lzo_deflate", ".lzo_deflate") {
+    @Override
+    public ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
+      return Channels.newChannel(
+          LzoCompression.createLzoInputStream(Channels.newInputStream(channel)));
+    }
+
+    @Override
+    public WritableByteChannel writeCompressed(WritableByteChannel channel) throws IOException {
+      return Channels.newChannel(
+          LzoCompression.createLzoOutputStream(Channels.newOutputStream(channel)));
+    }
+  },
+
+  /**
+   * LZOP compression using LZOP Codec. .lzo extension is specified for the files with magic bytes
+   * and headers.
+   *
 
 Review comment:
   ```suggestion
  *
  * Warning: The LZOP codec being used does not support concatenated LZOP streams and will
  * silently ignore data after the end of the first LZOP stream.
  *
   ```
 


Issue Time Tracking
---

Worklog Id: (was: 389526)
Time Spent: 11h 10m  (was: 11h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389530=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389530
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:51
Start Date: 19/Feb/20 17:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r381437634
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +153,54 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
     }
   },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   *
+   * The Beam Java SDK does not pull in the required libraries for LZO compression by default, so
+   * it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write
+   * .lzo_deflate files without {@code airlift/aircompressor} and {@code presto-hadoop-apache2}
+   * loaded will result in {@code NoClassDefFoundError} at runtime.
 
 Review comment:
   ```suggestion
  * loaded will result in a {@code NoClassDefFoundError} at runtime.
   ```
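
The Javadoc under review states that missing LZO libraries surface as a `NoClassDefFoundError` at runtime. One way a pipeline author might fail earlier is to probe for a codec class before starting any reads. The sketch below is illustrative only: the class name `io.airlift.compress.lzo.LzoCodec` is an assumption about what `io.airlift:aircompressor` provides, and `OptionalDependencyCheck` is a hypothetical helper that is not part of Beam.

```java
public class OptionalDependencyCheck {
  // Assumed codec class name; adjust it to whatever the LZO library on the
  // classpath actually provides.
  static final String ASSUMED_LZO_CLASS = "io.airlift.compress.lzo.LzoCodec";

  /** Returns true if the named class can be located, without initializing it. */
  static boolean isClassAvailable(String className) {
    try {
      Class.forName(className, false, OptionalDependencyCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      // ClassNotFoundException: the class is absent from the classpath.
      // LinkageError: the class is present but one of its own dependencies is not.
      return false;
    }
  }

  public static void main(String[] args) {
    if (isClassAvailable(ASSUMED_LZO_CLASS)) {
      System.out.println("LZO codec classes found on the classpath.");
    } else {
      System.err.println(
          "LZO codec classes not found; declare the optional dependencies "
              + "before reading or writing .lzo_deflate files.");
    }
  }
}
```

Passing `false` as the second argument to `Class.forName` avoids running static initializers, so the probe only tests whether the class can be loaded.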
 


Issue Time Tracking
---

Worklog Id: (was: 389530)
Time Spent: 11h 20m  (was: 11h 10m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389528=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389528
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:51
Start Date: 19/Feb/20 17:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r381437087
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +153,54 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
     }
   },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   *
+   * The Beam Java SDK does not pull in the required libraries for LZO compression by default, so
+   * it is the user's responsibility to declare an explicit dependency on {@code
+   * airlift/aircompressor} and {@code presto-hadoop-apache2}. Attempts to read or write
 
 Review comment:
   ```suggestion
  * io.airlift:aircompressor} and {@code com.facebook.presto.hadoop:hadoop-apache2}. Attempts to read or write
   ```
 


Issue Time Tracking
---

Worklog Id: (was: 389528)
Time Spent: 11h 10m  (was: 11h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389517=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389517
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:41
Start Date: 19/Feb/20 17:41
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r381435599
 
 

 ##
 File path: sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -738,7 +1069,133 @@ public void testGzipProgress() throws IOException {
   assertThat(readerOrig, instanceOf(CompressedReader.class));
   CompressedReader reader = (CompressedReader) readerOrig;
   // before starting
-  assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
+  assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(0, reader.getSplitPointsConsumed());
+  assertEquals(1, reader.getSplitPointsRemaining());
+
+  // confirm has three records
+  for (int i = 0; i < numRecords; ++i) {
+if (i == 0) {
+  assertTrue(reader.start());
+} else {
+  assertTrue(reader.advance());
+}
+assertEquals(0, reader.getSplitPointsConsumed());
+assertEquals(1, reader.getSplitPointsRemaining());
+  }
+  assertFalse(reader.advance());
+
+  // after reading empty source
 
 Review comment:
   Best to add that comment over the LZOP enum since nobody reading the 
documentation is going to find the comment in the tests.
 



Issue Time Tracking
---

Worklog Id: (was: 389517)
Time Spent: 10h 50m  (was: 10h 40m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389515&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389515
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 19/Feb/20 17:38
Start Date: 19/Feb/20 17:38
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-588347547
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 389515)
Time Spent: 10h 40m  (was: 10.5h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389114
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:37
Start Date: 18/Feb/20 21:37
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-587893601
 
 
   @lukecwik I've incorporated most of the suggested changes in the PR. 
Please let me know your thoughts.
   
 



Issue Time Tracking
---

Worklog Id: (was: 389114)
Time Spent: 10.5h  (was: 10h 20m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389111
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:33
Start Date: 18/Feb/20 21:33
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r380947952
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -738,7 +1069,133 @@ public void testGzipProgress() throws IOException {
   assertThat(readerOrig, instanceOf(CompressedReader.class));
   CompressedReader reader = (CompressedReader) readerOrig;
   // before starting
-  assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
+  assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(0, reader.getSplitPointsConsumed());
+  assertEquals(1, reader.getSplitPointsRemaining());
+
+  // confirm has three records
+  for (int i = 0; i < numRecords; ++i) {
+if (i == 0) {
+  assertTrue(reader.start());
+} else {
+  assertTrue(reader.advance());
+}
+assertEquals(0, reader.getSplitPointsConsumed());
+assertEquals(1, reader.getSplitPointsRemaining());
+  }
+  assertFalse(reader.advance());
+
+  // after reading empty source
 
 Review comment:
   For now, we have added a comment warning users that a concatenated LZO file 
does not get decompressed correctly. It is added above the 
testFalseReadConcatenatedLzop and testFalseReadMultiStreamLzop methods. 
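
To make the term concrete: a concatenated file is two or more complete compressed streams written back to back. The sketch below illustrates the idea with `java.util.zip` GZIP streams as a stand-in, an assumption made only so the example runs on the JDK alone; it does not use the LZO/LZOP codecs from this PR. `ConcatenatedStreamSketch` and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch; GZIP stands in for LZOP so that only the JDK is needed. */
final class ConcatenatedStreamSketch {
  private ConcatenatedStreamSketch() {}

  /** Compresses the text into one complete, self-contained GZIP stream. */
  static byte[] compress(String text) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
      gzip.write(text.getBytes(StandardCharsets.UTF_8));
    }
    return bytes.toByteArray();
  }

  /** Writes two complete streams back to back, as concatenating two files would. */
  static byte[] concatenate(byte[] first, byte[] second) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    bytes.write(first, 0, first.length);
    bytes.write(second, 0, second.length);
    return bytes.toByteArray();
  }

  /** Decodes until the decoder reports end of input. */
  static String decompress(byte[] compressed) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buffer = new byte[4096];
      int n;
      while ((n = in.read(buffer)) != -1) {
        out.write(buffer, 0, n);
      }
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }
}
```

A test of this shape, which compares the decoded bytes against the full expected content rather than only the first stream's content, is one way to detect a decoder that stops after the first stream.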
 



Issue Time Tracking
---

Worklog Id: (was: 389111)
Time Spent: 10h 20m  (was: 10h 10m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389097
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:14
Start Date: 18/Feb/20 21:14
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r380938634
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -91,4 +95,6 @@ dependencies {
   shadowTest library.java.avro_tests
   shadowTest library.java.zstd_jni
   testRuntimeOnly library.java.slf4j_jdk14
+  compileOnly 'io.airlift:aircompressor:0.16' 
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 389097)
Time Spent: 10h  (was: 9h 50m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389096
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:14
Start Date: 18/Feb/20 21:14
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r380938566
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -91,4 +95,6 @@ dependencies {
   shadowTest library.java.avro_tests
   shadowTest library.java.zstd_jni
   testRuntimeOnly library.java.slf4j_jdk14
+  compileOnly 'io.airlift:aircompressor:0.16' 
+  compileOnly 'com.facebook.presto.hadoop:hadoop-apache2:3.2.0-1'
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 389096)
Time Spent: 9h 50m  (was: 9h 40m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389098
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:14
Start Date: 18/Feb/20 21:14
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r380938720
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -58,6 +58,10 @@ test {
   }
 }
 
+configurations {
+testCompile.extendsFrom compileOnly
+}
+
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 389098)
Time Spent: 10h 10m  (was: 10h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389094&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389094
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:14
Start Date: 18/Feb/20 21:14
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r380938458
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -91,4 +95,6 @@ dependencies {
   shadowTest library.java.avro_tests
   shadowTest library.java.zstd_jni
   testRuntimeOnly library.java.slf4j_jdk14
+  compileOnly 'io.airlift:aircompressor:0.16' 
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 389094)
Time Spent: 9h 40m  (was: 9.5h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389090&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389090
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:13
Start Date: 18/Feb/20 21:13
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r380938051
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/util/LzoCompressorInputStream.java
 ##
 @@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.util;
+
+import io.airlift.compress.lzo.LzoCodec;
+import java.io.IOException;
+import java.io.InputStream;
+import org.apache.commons.compress.compressors.CompressorInputStream;
+import org.apache.commons.compress.utils.CountingInputStream;
+import org.apache.commons.compress.utils.IOUtils;
+import org.apache.commons.compress.utils.InputStreamStatistics;
+
+/**
+ * {@link CompressorInputStream} implementation to create LZO encoded stream. Library relies on <a href="https://github.com/airlift/aircompressor/">LZO</a>
+ *
+ * @since 1.18
+ */
+public class LzoCompressorInputStream extends CompressorInputStream
 
 Review comment:
   replaced wrapper classes with static methods
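
A minimal sketch of what replacing wrapper stream classes with static methods can look like. Everything here is an assumption for illustration: `CodecStreams`, `createInputStream`, `createOutputStream`, and `roundTrip` are hypothetical names, and the JDK's DEFLATE streams stand in for the aircompressor LZO codec so that the example needs no extra dependency.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical utility: static factories instead of dedicated wrapper stream classes. */
final class CodecStreams {
  private CodecStreams() {} // static-only holder, never instantiated

  /** Wraps a raw stream in a decompressing stream (DEFLATE is a stand-in for LZO). */
  static InputStream createInputStream(InputStream raw) {
    return new InflaterInputStream(raw);
  }

  /** Wraps a raw stream in a compressing stream (DEFLATE is a stand-in for LZO). */
  static OutputStream createOutputStream(OutputStream raw) {
    return new DeflaterOutputStream(raw);
  }

  /** Demo helper: compresses the text, decompresses it again, and returns the result. */
  static String roundTrip(String text) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (OutputStream out = createOutputStream(sink)) {
      out.write(text.getBytes(StandardCharsets.UTF_8));
    }
    ByteArrayOutputStream decoded = new ByteArrayOutputStream();
    try (InputStream in = createInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
      byte[] buffer = new byte[1024];
      int n;
      while ((n = in.read(buffer)) != -1) {
        decoded.write(buffer, 0, n);
      }
    }
    return new String(decoded.toByteArray(), StandardCharsets.UTF_8);
  }
}
```

With this shape, callers only ever see `InputStream` and `OutputStream`, so the codec behind the factory methods can change without touching the call sites.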
 



Issue Time Tracking
---

Worklog Id: (was: 389090)
Time Spent: 9h 10m  (was: 9h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389093&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389093
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:13
Start Date: 18/Feb/20 21:13
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r380938387
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +156,38 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
 }
   },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 389093)
Time Spent: 9.5h  (was: 9h 20m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389092&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389092
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:13
Start Date: 18/Feb/20 21:13
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r380938292
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +156,38 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
 }
   },
 
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   */
+  LZO(".lzo_deflate", ".lzo_deflate") {
+    @Override
+    public ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
+      return Channels.newChannel(new LzoCompressorInputStream(Channels.newInputStream(channel)));
+    }
+
+    @Override
+    public WritableByteChannel writeCompressed(WritableByteChannel channel) throws IOException {
+      return Channels.newChannel(new LzoCompressorOutputStream(Channels.newOutputStream(channel)));
+    }
+  },
+
+  /**
+   * LZOP compression using LZOP Codec. .lzo extension is specified for the files with magic bytes
+   * and headers.
 
 Review comment:
   done
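
The two Javadoc comments in the hunk above tie a file suffix to each mode: `.lzo_deflate` for raw LZO data without headers and `.lzo` for LZOP files with magic bytes and headers. A minimal sketch of suffix-based detection under that mapping follows; `SketchCompression` and `detect` are hypothetical names and this is not Beam's `Compression` enum.

```java
/** Hypothetical sketch of suffix-based mode detection; not Beam's Compression enum. */
enum SketchCompression {
  LZO(".lzo_deflate"), // raw LZO data, no headers
  LZOP(".lzo"),        // lzop container with magic bytes and headers
  UNCOMPRESSED("");    // fallback when no known suffix matches

  private final String suffix;

  SketchCompression(String suffix) {
    this.suffix = suffix;
  }

  /** Returns the first mode whose suffix matches the file name, else UNCOMPRESSED. */
  static SketchCompression detect(String filename) {
    for (SketchCompression mode : values()) {
      if (mode != UNCOMPRESSED && filename.endsWith(mode.suffix)) {
        return mode;
      }
    }
    return UNCOMPRESSED;
  }
}
```

Because `.lzo_deflate` does not end with `.lzo`, the two suffixes cannot be confused by an `endsWith` check, whichever order the constants are tested in.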
 



Issue Time Tracking
---

Worklog Id: (was: 389092)
Time Spent: 9h 20m  (was: 9h 10m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389085&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389085
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:12
Start Date: 18/Feb/20 21:12
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r380937429
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -738,7 +1069,133 @@ public void testGzipProgress() throws IOException {
   assertThat(readerOrig, instanceOf(CompressedReader.class));
   CompressedReader reader = (CompressedReader) readerOrig;
   // before starting
-  assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
+  assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(0, reader.getSplitPointsConsumed());
+  assertEquals(1, reader.getSplitPointsRemaining());
+
+  // confirm has three records
+  for (int i = 0; i < numRecords; ++i) {
+if (i == 0) {
+  assertTrue(reader.start());
+} else {
+  assertTrue(reader.advance());
+}
+assertEquals(0, reader.getSplitPointsConsumed());
+assertEquals(1, reader.getSplitPointsRemaining());
+  }
+  assertFalse(reader.advance());
+
+  // after reading empty source
+  assertEquals(1.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(1, reader.getSplitPointsConsumed());
+  assertEquals(0, reader.getSplitPointsRemaining());
+}
+  }
+
+  @Test
+  public void testEmptyLzoProgress() throws IOException {
+File tmpFile = tmpFolder.newFile("empty.lzo_deflate");
+String filename = tmpFile.toPath().toString();
+writeFile(tmpFile, new byte[0], CompressionMode.LZO);
+
+PipelineOptions options = PipelineOptionsFactory.create();
+CompressedSource source =
+CompressedSource.from(new ByteSource(filename, 1)).withDecompression(CompressionMode.LZO);
+try (BoundedReader readerOrig = source.createReader(options)) {
+  assertThat(readerOrig, instanceOf(CompressedReader.class));
+  CompressedReader reader = (CompressedReader) readerOrig;
+  // before starting
+  assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(0, reader.getSplitPointsConsumed());
+  assertEquals(1, reader.getSplitPointsRemaining());
+  // confirm empty
+  assertFalse(reader.start());
+  // after reading empty source
+  assertEquals(1.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(0, reader.getSplitPointsConsumed());
+  assertEquals(0, reader.getSplitPointsRemaining());
+}
+  }
+
+  @Test
+  public void testLzoProgress() throws IOException {
+int numRecords = 3;
+File tmpFile = tmpFolder.newFile("nonempty.lzo");
+String filename = tmpFile.toPath().toString();
+writeFile(tmpFile, new byte[numRecords], CompressionMode.LZO);
+
+PipelineOptions options = PipelineOptionsFactory.create();
+CompressedSource source =
+CompressedSource.from(new ByteSource(filename, 1)).withDecompression(CompressionMode.LZO);
+try (BoundedReader readerOrig = source.createReader(options)) {
+  assertThat(readerOrig, instanceOf(CompressedReader.class));
+  CompressedReader reader = (CompressedReader) readerOrig;
+  // before starting
+  assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(0, reader.getSplitPointsConsumed());
+  assertEquals(1, reader.getSplitPointsRemaining());
+
+  // confirm has three records
+  for (int i = 0; i < numRecords; ++i) {
+if (i == 0) {
+  assertTrue(reader.start());
+} else {
+  assertTrue(reader.advance());
+}
+assertEquals(0, reader.getSplitPointsConsumed());
+assertEquals(1, reader.getSplitPointsRemaining());
+  }
+  assertFalse(reader.advance());
+
+  // after reading empty source
 
 Review comment:
   done
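
The progress assertions in the hunk above can be read as a small contract for a source that is read as a single unit: one split point remains until reading ends, and a split point counts as consumed only if at least one record was read. The sketch below models just that contract. `SingleSplitProgress` is a hypothetical class, not Beam's `CompressedReader`, and it reports a fraction of 0.0 until reading finishes, which is a simplification since the tests only check the fraction before the first read and after the last.

```java
/** Hypothetical model of the progress values asserted in the tests; not Beam code. */
final class SingleSplitProgress {
  private final int totalRecords;
  private int recordsRead = 0;
  private boolean finished = false;

  SingleSplitProgress(int totalRecords) {
    this.totalRecords = totalRecords;
  }

  /** Plays the role of start()/advance(): true while another record is available. */
  boolean advance() {
    if (recordsRead < totalRecords) {
      recordsRead++;
      return true;
    }
    finished = true; // the failed read marks the end of the source
    return false;
  }

  /** Simplification: 0.0 until the source is exhausted, then 1.0. */
  double getFractionConsumed() {
    return finished ? 1.0 : 0.0;
  }

  /** The single split point counts as consumed only after a non-empty read completes. */
  long getSplitPointsConsumed() {
    return (finished && totalRecords > 0) ? 1 : 0;
  }

  /** One split point remains until reading ends, whether or not records were found. */
  long getSplitPointsRemaining() {
    return finished ? 0 : 1;
  }
}
```

This matches both cases in the tests: an empty source ends with 0 consumed and 0 remaining, while a source with records ends with 1 consumed and 0 remaining.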
 



Issue Time Tracking
---

Worklog Id: (was: 389085)
Time Spent: 9h  (was: 8h 50m)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=389084&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389084
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 18/Feb/20 21:11
Start Date: 18/Feb/20 21:11
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r380937373
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -755,7 +1212,7 @@ public void testGzipProgress() throws IOException {
   assertFalse(reader.advance());
 
   // after reading empty source
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 389084)
Time Spent: 8h 50m  (was: 8h 40m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=387187&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-387187
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 14/Feb/20 07:40
Start Date: 14/Feb/20 07:40
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-586136772
 
 
   @lukecwik we are working through all of your suggestions and will update the 
PR in a few days. Thank you for your patience.
 



Issue Time Tracking
---

Worklog Id: (was: 387187)
Time Spent: 8h 40m  (was: 8.5h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=386873&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-386873
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 13/Feb/20 20:57
Start Date: 13/Feb/20 20:57
Worklog Time Spent: 10m 
  Work Description: gsteelman commented on issue #10254: [BEAM-8564] Add 
LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-585967896
 
 
   Hi @amoght, can you address the open comments? I have reached out to a couple 
more folks internally at Twitter to request more eyes on this, and I am hoping 
to get this wrapped up soon. Thank you for your work so far.
 



Issue Time Tracking
---

Worklog Id: (was: 386873)
Time Spent: 8.5h  (was: 8h 20m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=384662&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384662
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 10/Feb/20 19:21
Start Date: 10/Feb/20 19:21
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-584305044
 
 
   Pinging to ask whether there has been any recent work to address the comments 
from my last review.
 



Issue Time Tracking
---

Worklog Id: (was: 384662)
Time Spent: 8h 20m  (was: 8h 10m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-02-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=384129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384129
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Feb/20 18:24
Start Date: 09/Feb/20 18:24
Worklog Time Spent: 10m 
  Work Description: nownikhil commented on issue #10254: [BEAM-8564] Add 
LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-583876691
 
 
   LGTM
   - Twitter Core data libraries team
 



Issue Time Tracking
---

Worklog Id: (was: 384129)
Time Spent: 8h 10m  (was: 8h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=376620&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-376620
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 23/Jan/20 22:21
Start Date: 23/Jan/20 22:21
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-577905297
 
 
   Ping, any updates?
 



Issue Time Tracking
---

Worklog Id: (was: 376620)
Time Spent: 8h  (was: 7h 50m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=373224&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373224
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 16/Jan/20 19:26
Start Date: 16/Jan/20 19:26
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r367605664
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -738,7 +1069,133 @@ public void testGzipProgress() throws IOException {
   assertThat(readerOrig, instanceOf(CompressedReader.class));
   CompressedReader reader = (CompressedReader) readerOrig;
   // before starting
-  assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
+  assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(0, reader.getSplitPointsConsumed());
+  assertEquals(1, reader.getSplitPointsRemaining());
+
+  // confirm has three records
+  for (int i = 0; i < numRecords; ++i) {
+    if (i == 0) {
+      assertTrue(reader.start());
+    } else {
+      assertTrue(reader.advance());
+    }
+    assertEquals(0, reader.getSplitPointsConsumed());
+    assertEquals(1, reader.getSplitPointsRemaining());
+  }
+  assertFalse(reader.advance());
+
+  // after reading empty source
 
 Review comment:
   It would be better if concatenated streams worked for LZOP or, failing that, 
if concatenated streams were detected and an exception was thrown to the user. 
Checking that the number of bytes read from the stream/channel equals the 
channel's length would be one way of supporting this.
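
A minimal sketch of the byte-count check described above, assuming a hypothetical `ConsumedBytesCheck` helper that is not part of Beam or of this PR. It wraps the raw input in a counting stream and compares the count with the channel length once the decoder reports end of stream. A decoder that buffers ahead may already have pulled trailing bytes from the input, so a real check might need the decoder's own consumed-byte count instead.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper, not part of Beam: detects input left over after decoding. */
final class ConsumedBytesCheck {

  /** Counts every byte that is read or skipped through this wrapper. */
  static final class CountingStream extends FilterInputStream {
    private long count;

    CountingStream(InputStream in) {
      super(in);
    }

    @Override
    public int read() throws IOException {
      int b = super.read();
      if (b >= 0) {
        count++;
      }
      return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
      int n = super.read(buf, off, len);
      if (n > 0) {
        count += n;
      }
      return n;
    }

    @Override
    public long skip(long n) throws IOException {
      long skipped = super.skip(n);
      count += skipped;
      return skipped;
    }

    long getCount() {
      return count;
    }
  }

  /** Throws if fewer (or more) bytes were consumed than the channel holds. */
  static void verifyFullyConsumed(long bytesRead, long channelLength) throws IOException {
    if (bytesRead != channelLength) {
      throw new IOException(
          "Read " + bytesRead + " of " + channelLength
              + " bytes; the input may contain concatenated streams.");
    }
  }
}
```

A reader could call `verifyFullyConsumed(counting.getCount(), length)` after the last record, where `length` is whatever size the underlying channel reports.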
 



Issue Time Tracking
---

Worklog Id: (was: 373224)
Time Spent: 7h 50m  (was: 7h 40m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=373223&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373223
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 16/Jan/20 19:25
Start Date: 16/Jan/20 19:25
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r367605664
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -738,7 +1069,133 @@ public void testGzipProgress() throws IOException {
   assertThat(readerOrig, instanceOf(CompressedReader.class));
   CompressedReader reader = (CompressedReader) readerOrig;
   // before starting
-  assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
+  assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(0, reader.getSplitPointsConsumed());
+  assertEquals(1, reader.getSplitPointsRemaining());
+
+  // confirm has three records
+  for (int i = 0; i < numRecords; ++i) {
+    if (i == 0) {
+      assertTrue(reader.start());
+    } else {
+      assertTrue(reader.advance());
+    }
+    assertEquals(0, reader.getSplitPointsConsumed());
+    assertEquals(1, reader.getSplitPointsRemaining());
+  }
+  assertFalse(reader.advance());
+
+  // after reading empty source
 
 Review comment:
   It would be better if concatenated streams worked for LZOP or, failing that, 
if concatenated streams were detected and an exception was thrown to the user.
 



Issue Time Tracking
---

Worklog Id: (was: 373223)
Time Spent: 7h 40m  (was: 7.5h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=373222&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373222
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 16/Jan/20 19:24
Start Date: 16/Jan/20 19:24
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r367605058
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/util/LzoCompressorInputStream.java
 ##
 @@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.util;
+
+import io.airlift.compress.lzo.LzoCodec;
+import java.io.IOException;
+import java.io.InputStream;
+import org.apache.commons.compress.compressors.CompressorInputStream;
+import org.apache.commons.compress.utils.CountingInputStream;
+import org.apache.commons.compress.utils.IOUtils;
+import org.apache.commons.compress.utils.InputStreamStatistics;
+
+/**
+ * {@link CompressorInputStream} implementation to create LZO encoded stream. 
Library relies on <a href="https://github.com/airlift/aircompressor/">LZO</a>
+ *
+ * @since 1.18
+ */
+public class LzoCompressorInputStream extends CompressorInputStream
 
 Review comment:
   A class called LzoCompression in util is fine.
 



Issue Time Tracking
---

Worklog Id: (was: 373222)
Time Spent: 7.5h  (was: 7h 20m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=373189&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373189
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 16/Jan/20 17:55
Start Date: 16/Jan/20 17:55
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r367564910
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -738,7 +1069,133 @@ public void testGzipProgress() throws IOException {
   assertThat(readerOrig, instanceOf(CompressedReader.class));
   CompressedReader reader = (CompressedReader) readerOrig;
   // before starting
-  assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
+  assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(0, reader.getSplitPointsConsumed());
+  assertEquals(1, reader.getSplitPointsRemaining());
+
+  // confirm has three records
+  for (int i = 0; i < numRecords; ++i) {
+    if (i == 0) {
+      assertTrue(reader.start());
+    } else {
+      assertTrue(reader.advance());
+    }
+    assertEquals(0, reader.getSplitPointsConsumed());
+    assertEquals(1, reader.getSplitPointsRemaining());
+  }
+  assertFalse(reader.advance());
+
+  // after reading empty source
 
 Review comment:
   Line 325: `public void testReadConcatenatedLzo() throws IOException`
   Is this comment about the LZOP codec? Unlike LZOP, the LZO codec supports file 
concatenation. We also have testReadMultiStreamLzo and 
testFalseReadConcatenatedLzop.
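
To make "supports file concatenation" concrete, here is a small sketch that uses the JDK's gzip classes as a stand-in codec, purely so that it runs without extra dependencies. The class and method names are hypothetical, and the sketch says nothing about how LZO or LZOP behave: it only shows the property under discussion, that two independently compressed files appended byte for byte decode to the concatenation of their contents.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical demo class; gzip is a stand-in here, not the codec from the PR. */
final class ConcatenatedStreamsDemo {

  /** Compresses the text into a complete, standalone gzip member. */
  static byte[] gzip(String text) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(text.getBytes(StandardCharsets.UTF_8));
    }
    return out.toByteArray();
  }

  /** Decompresses everything the gzip decoder yields from the given bytes. */
  static String gunzip(byte[] compressed) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buf = new byte[4096];
      int n;
      while ((n = gz.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  /** Appends two independently compressed files byte for byte. */
  static byte[] concat(byte[] a, byte[] b) {
    byte[] joined = new byte[a.length + b.length];
    System.arraycopy(a, 0, joined, 0, a.length);
    System.arraycopy(b, 0, joined, a.length, b.length);
    return joined;
  }
}
```

A test for a codec without this property would instead expect either only the first file's contents or an error, which is the distinction the review thread is drawing between the two test cases.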
 



Issue Time Tracking
---

Worklog Id: (was: 373189)
Time Spent: 7h 20m  (was: 7h 10m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=373178&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373178
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 16/Jan/20 17:52
Start Date: 16/Jan/20 17:52
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r367563291
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/util/LzoCompressorInputStream.java
 ##
 @@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.util;
+
+import io.airlift.compress.lzo.LzoCodec;
+import java.io.IOException;
+import java.io.InputStream;
+import org.apache.commons.compress.compressors.CompressorInputStream;
+import org.apache.commons.compress.utils.CountingInputStream;
+import org.apache.commons.compress.utils.IOUtils;
+import org.apache.commons.compress.utils.InputStreamStatistics;
+
+/**
+ * {@link CompressorInputStream} implementation to create LZO encoded stream. 
Library relies on <a href="https://github.com/airlift/aircompressor/">LZO</a>
+ *
+ * @since 1.18
+ */
+public class LzoCompressorInputStream extends CompressorInputStream
 
 Review comment:
   Where do you suggest we keep the static methods? Currently we have a wrapper 
class in the util package.
 



Issue Time Tracking
---

Worklog Id: (was: 373178)
Time Spent: 7h 10m  (was: 7h)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369349&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369349
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:53
Start Date: 09/Jan/20 20:53
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364954499
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -755,7 +1212,7 @@ public void testGzipProgress() throws IOException {
   assertFalse(reader.advance());
 
   // after reading empty source
 
 Review comment:
   ```suggestion
 // after reading source
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 369349)
Time Spent: 6h 50m  (was: 6h 40m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369350&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369350
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:53
Start Date: 09/Jan/20 20:53
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364954346
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -738,7 +1069,133 @@ public void testGzipProgress() throws IOException {
   assertThat(readerOrig, instanceOf(CompressedReader.class));
   CompressedReader reader = (CompressedReader) readerOrig;
   // before starting
-  assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
+  assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+  assertEquals(0, reader.getSplitPointsConsumed());
+  assertEquals(1, reader.getSplitPointsRemaining());
+
+  // confirm has three records
+  for (int i = 0; i < numRecords; ++i) {
+    if (i == 0) {
+      assertTrue(reader.start());
+    } else {
+      assertTrue(reader.advance());
+    }
+    assertEquals(0, reader.getSplitPointsConsumed());
+    assertEquals(1, reader.getSplitPointsRemaining());
+  }
+  assertFalse(reader.advance());
+
+  // after reading empty source
 
 Review comment:
   ```suggestion
 // after reading source
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 369350)
Time Spent: 6h 50m  (was: 6h 40m)



[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369348&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369348
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:53
Start Date: 09/Jan/20 20:53
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364949936
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -235,6 +315,30 @@ public void testReadConcatenatedGzip() throws IOException 
{
 assertEquals(Bytes.asList(expected), actual);
   }
 
+  /**
+   * Using Lzo Codec Test a concatenation of lzo files is correctly 
decompressed.
+   *
+   * A concatenation of lzo files as one file is a valid lzo file and 
should decompress to be the
+   * concatenation of those individual files.
 
 Review comment:
   The closing </p> tag isn't needed in javadoc even if your editor is 
inserting it.
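
For illustration, a Javadoc block written in the style this comment asks for: each new paragraph opens with `<p>` and no closing tag follows it. The class and method below are hypothetical and are not taken from the PR.

```java
/** Hypothetical example class illustrating Javadoc paragraph style. */
final class JavadocParagraphExample {

  /**
   * Joins two byte arrays into one.
   *
   * <p>A second paragraph starts with an opening tag only. Nothing closes it; the next
   * paragraph or block tag ends it implicitly.
   *
   * @param first bytes placed first in the result
   * @param second bytes appended after {@code first}
   * @return a new array containing both inputs in order
   */
  static byte[] concat(byte[] first, byte[] second) {
    byte[] out = new byte[first.length + second.length];
    System.arraycopy(first, 0, out, 0, first.length);
    System.arraycopy(second, 0, out, first.length, second.length);
    return out;
  }
}
```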
 



Issue Time Tracking
---

Worklog Id: (was: 369348)
Time Spent: 6h 40m  (was: 6.5h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369352=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369352
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:53
Start Date: 09/Jan/20 20:53
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364954411
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -738,7 +1069,133 @@ public void testGzipProgress() throws IOException {
       assertThat(readerOrig, instanceOf(CompressedReader.class));
       CompressedReader reader = (CompressedReader) readerOrig;
       // before starting
-      assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
+      assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+      assertEquals(0, reader.getSplitPointsConsumed());
+      assertEquals(1, reader.getSplitPointsRemaining());
+
+      // confirm has three records
+      for (int i = 0; i < numRecords; ++i) {
+        if (i == 0) {
+          assertTrue(reader.start());
+        } else {
+          assertTrue(reader.advance());
+        }
+        assertEquals(0, reader.getSplitPointsConsumed());
+        assertEquals(1, reader.getSplitPointsRemaining());
+      }
+      assertFalse(reader.advance());
+
+      // after reading empty source
+      assertEquals(1.0, reader.getFractionConsumed(), DELTA);
+      assertEquals(1, reader.getSplitPointsConsumed());
+      assertEquals(0, reader.getSplitPointsRemaining());
+    }
+  }
+
+  @Test
+  public void testEmptyLzoProgress() throws IOException {
+    File tmpFile = tmpFolder.newFile("empty.lzo_deflate");
+    String filename = tmpFile.toPath().toString();
+    writeFile(tmpFile, new byte[0], CompressionMode.LZO);
+
+    PipelineOptions options = PipelineOptionsFactory.create();
+    CompressedSource source =
+        CompressedSource.from(new ByteSource(filename, 1)).withDecompression(CompressionMode.LZO);
+    try (BoundedReader readerOrig = source.createReader(options)) {
+      assertThat(readerOrig, instanceOf(CompressedReader.class));
+      CompressedReader reader = (CompressedReader) readerOrig;
+      // before starting
+      assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+      assertEquals(0, reader.getSplitPointsConsumed());
+      assertEquals(1, reader.getSplitPointsRemaining());
+      // confirm empty
+      assertFalse(reader.start());
+      // after reading empty source
+      assertEquals(1.0, reader.getFractionConsumed(), DELTA);
+      assertEquals(0, reader.getSplitPointsConsumed());
+      assertEquals(0, reader.getSplitPointsRemaining());
+    }
+  }
+
+  @Test
+  public void testLzoProgress() throws IOException {
+    int numRecords = 3;
+    File tmpFile = tmpFolder.newFile("nonempty.lzo");
+    String filename = tmpFile.toPath().toString();
+    writeFile(tmpFile, new byte[numRecords], CompressionMode.LZO);
+
+    PipelineOptions options = PipelineOptionsFactory.create();
+    CompressedSource source =
+        CompressedSource.from(new ByteSource(filename, 1)).withDecompression(CompressionMode.LZO);
+    try (BoundedReader readerOrig = source.createReader(options)) {
+      assertThat(readerOrig, instanceOf(CompressedReader.class));
+      CompressedReader reader = (CompressedReader) readerOrig;
+      // before starting
+      assertEquals(0.0, reader.getFractionConsumed(), DELTA);
+      assertEquals(0, reader.getSplitPointsConsumed());
+      assertEquals(1, reader.getSplitPointsRemaining());
+
+      // confirm has three records
+      for (int i = 0; i < numRecords; ++i) {
+        if (i == 0) {
+          assertTrue(reader.start());
+        } else {
+          assertTrue(reader.advance());
+        }
+        assertEquals(0, reader.getSplitPointsConsumed());
+        assertEquals(1, reader.getSplitPointsRemaining());
+      }
+      assertFalse(reader.advance());
+
+      // after reading empty source
 
 Review comment:
   ```suggestion
 // after reading source
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 369352)
Time Spent: 7h  (was: 6h 50m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  

[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369351=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369351
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:53
Start Date: 09/Jan/20 20:53
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364953576
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -235,6 +315,30 @@ public void testReadConcatenatedGzip() throws IOException {
      assertEquals(Bytes.asList(expected), actual);
    }
  
+  /**
+   * Using Lzo Codec Test a concatenation of lzo files is correctly decompressed.
+   *
+   * A concatenation of lzo files as one file is a valid lzo file and should decompress to be the
+   * concatenation of those individual files.
+   */
+  @Test
+  public void testReadConcatenatedLzo() throws IOException {
 
 Review comment:
   Can we either add support for multistream or throw an exception if the 
stream isn't finished?
   
   It would be dangerous for users to have part of their data silently dropped 
in this scenario. We should also add to the comment that concatenated streams 
aren't supported.
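
One way to make the "throw an exception" option concrete is to check the raw input for leftover bytes once the decoder reports end of stream. The following is a minimal sketch, not code from the PR: `TrailingDataCheck` and `failOnTrailingData` are hypothetical names, and it assumes the decoder leaves the raw stream positioned exactly at the end of the first compressed stream (a decoder that buffers ahead would need to expose its unread remainder instead).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper; the names are illustrative and not part of Beam or the PR. */
public class TrailingDataCheck {

  /**
   * Call after the decoder has signalled end-of-stream. Assumes {@code raw} is positioned
   * exactly at the end of the first compressed stream (the decoder did not read ahead).
   */
  static void failOnTrailingData(InputStream raw) throws IOException {
    if (raw.read() != -1) {
      throw new IOException(
          "Found data after the end of the first compressed stream; "
              + "concatenated streams are not supported");
    }
  }

  public static void main(String[] args) throws IOException {
    // Fully consumed input: the check passes silently.
    failOnTrailingData(new ByteArrayInputStream(new byte[0]));

    // Leftover bytes: the check fails loudly instead of dropping them.
    try {
      failOnTrailingData(new ByteArrayInputStream(new byte[] {42}));
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Failing here trades convenience for safety: a user with concatenated files gets an error they can act on rather than a short read they may never notice.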
 



Issue Time Tracking
---

Worklog Id: (was: 369351)
Time Spent: 6h 50m  (was: 6h 40m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369327=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369327
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364922097
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +156,38 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
      }
    },
  
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
 
 Review comment:
   Please add to this comment telling people what dependencies they need to 
pull in similar to the comment to zstd above.
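
A sketch of what such a note could look like follows. It is illustrative only: `CompressionDocSketch` and its `Mode` enum are hypothetical stand-ins for the real `Compression` enum, the artifact names are the ones this PR adds to `build.gradle`, and the wording of the existing zstd comment is not shown in this thread, so the phrasing is an assumption.

```java
public class CompressionDocSketch {

  /** Hypothetical stand-in for the real enum; only the Javadoc matters here. */
  enum Mode {
    /**
     * LZO compression using the LZO codec. The {@code .lzo_deflate} extension is used for files
     * that contain only LZO-compressed data, without headers.
     *
     * <p>The codec libraries are not bundled (assumption based on the build changes in this
     * PR). To use this mode, add {@code io.airlift:aircompressor} and
     * {@code com.facebook.presto.hadoop:hadoop-apache2} to your own build.
     */
    LZO(".lzo_deflate");

    private final String suffix;

    Mode(String suffix) {
      this.suffix = suffix;
    }

    String suffix() {
      return suffix;
    }
  }

  public static void main(String[] args) {
    System.out.println(Mode.LZO.suffix());
  }
}
```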
 



Issue Time Tracking
---

Worklog Id: (was: 369327)
Time Spent: 6h 20m  (was: 6h 10m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369326=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369326
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364922208
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +156,38 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
      }
    },
  
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   */
+  LZO(".lzo_deflate", ".lzo_deflate") {
+    @Override
+    public ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
+      return Channels.newChannel(new LzoCompressorInputStream(Channels.newInputStream(channel)));
+    }
+
+    @Override
+    public WritableByteChannel writeCompressed(WritableByteChannel channel) throws IOException {
+      return Channels.newChannel(new LzoCompressorOutputStream(Channels.newOutputStream(channel)));
+    }
+  },
+
+  /**
+   * LZOP compression using LZOP Codec. .lzo extension is specified for the files with magic bytes
+   * and headers.
 
 Review comment:
   Please add to this comment telling people what dependencies they need to 
pull in similar to the comment to zstd above.
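
The Javadoc in the diff above distinguishes headerless `.lzo_deflate` data from `.lzo` files that start with magic bytes. As a rough illustration of that difference, the sketch below checks a buffer for the nine-byte signature written by the `lzop` tool. `LzopMagicSketch` and `looksLikeLzop` are hypothetical names, and the signature constant is an assumption that should be verified against the lzop sources before relying on it.

```java
import java.util.Arrays;

/** Hypothetical helper; not part of Beam or the PR. */
public class LzopMagicSketch {

  // Signature believed to open every lzop file (assumption; verify against lzop itself).
  private static final byte[] LZOP_MAGIC = {
    (byte) 0x89, 'L', 'Z', 'O', 0x00, 0x0d, 0x0a, 0x1a, 0x0a
  };

  /** Returns true if the buffer starts with the lzop signature. */
  static boolean looksLikeLzop(byte[] header) {
    return header.length >= LZOP_MAGIC.length
        && Arrays.equals(Arrays.copyOf(header, LZOP_MAGIC.length), LZOP_MAGIC);
  }

  public static void main(String[] args) {
    byte[] lzopLike = {(byte) 0x89, 'L', 'Z', 'O', 0x00, 0x0d, 0x0a, 0x1a, 0x0a, 1, 2};
    byte[] rawLzo = {1, 2, 3};
    System.out.println(looksLikeLzop(lzopLike));
    System.out.println(looksLikeLzop(rawLzo));
  }
}
```

Raw LZO data has no such marker, which is why the two formats need separate enum constants and file extensions rather than auto-detection alone.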
 



Issue Time Tracking
---

Worklog Id: (was: 369326)
Time Spent: 6h 20m  (was: 6h 10m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369324=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369324
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364921304
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -91,4 +95,6 @@ dependencies {
   shadowTest library.java.avro_tests
   shadowTest library.java.zstd_jni
   testRuntimeOnly library.java.slf4j_jdk14
+  compileOnly 'io.airlift:aircompressor:0.16' 
 
 Review comment:
   ```suggestion
 provided 'io.airlift:aircompressor:0.16' 
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 369324)
Time Spent: 6h  (was: 5h 50m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 6h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369321=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369321
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364920670
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -58,6 +58,10 @@ test {
   }
 }
 
+configurations {
+testCompile.extendsFrom compileOnly
+}
 
 Review comment:
   ```suggestion
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 369321)
Time Spent: 5h 40m  (was: 5.5h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369320=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369320
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364920514
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -58,6 +58,10 @@ test {
   }
 }
 
+configurations {
 
 Review comment:
   ```suggestion
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 369320)
Time Spent: 5.5h  (was: 5h 20m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 5.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369319=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369319
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364920618
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -58,6 +58,10 @@ test {
   }
 }
 
+configurations {
+testCompile.extendsFrom compileOnly
 
 Review comment:
   ```suggestion
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 369319)
Time Spent: 5.5h  (was: 5h 20m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 5.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369323=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369323
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364921379
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -91,4 +95,6 @@ dependencies {
   shadowTest library.java.avro_tests
   shadowTest library.java.zstd_jni
   testRuntimeOnly library.java.slf4j_jdk14
+  compileOnly 'io.airlift:aircompressor:0.16' 
+  compileOnly 'com.facebook.presto.hadoop:hadoop-apache2:3.2.0-1'
 
 Review comment:
   ```suggestion
 provided 'com.facebook.presto.hadoop:hadoop-apache2:3.2.0-1'
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 369323)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369328=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369328
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364935674
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/util/LzoCompressorInputStream.java
 ##
 @@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.util;
+
+import io.airlift.compress.lzo.LzoCodec;
+import java.io.IOException;
+import java.io.InputStream;
+import org.apache.commons.compress.compressors.CompressorInputStream;
+import org.apache.commons.compress.utils.CountingInputStream;
+import org.apache.commons.compress.utils.IOUtils;
+import org.apache.commons.compress.utils.InputStreamStatistics;
+
+/**
+ * {@link CompressorInputStream} implementation to create LZO encoded stream. Library relies on <a href="https://github.com/airlift/aircompressor/">LZO</a>
+ *
+ * @since 1.18
+ */
+public class LzoCompressorInputStream extends CompressorInputStream
 
 Review comment:
   Instead of creating a wrapper class that delegates to lzoIS, create a static 
method which returns the LZO and LZOP input and output streams and invoke the 
appropriate static method from the enum's readDecompressed/writeCompressed methods.
   
   Putting the code into a static method will prevent the LzoCodec/LzopCodec 
from being loaded till the static method is called.
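
The effect described here can be reproduced with a small self-contained sketch. Everything in it is hypothetical: `FakeCodec` stands in for an optional third-party codec such as `LzoCodec`, `CodecStreams` plays the role of the suggested static factory, and `Mode` mimics the enum. The static initializer only records when the codec class is first initialized.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicBoolean;

public class LazyCodecLoading {
  static final AtomicBoolean CODEC_INITIALIZED = new AtomicBoolean(false);

  /** Hypothetical stand-in for an optional third-party codec class. */
  static class FakeCodec {
    static {
      CODEC_INITIALIZED.set(true); // runs once, when the class is first initialized
    }

    InputStream createInputStream(InputStream in) {
      return in;
    }
  }

  /** Hypothetical factory: the only place that touches FakeCodec. */
  static class CodecStreams {
    static InputStream createInputStream(InputStream in) {
      return new FakeCodec().createInputStream(in);
    }
  }

  /** Mimics the enum: each constant decides how to open a stream. */
  enum Mode {
    UNCOMPRESSED {
      @Override
      InputStream open(InputStream in) {
        return in;
      }
    },
    FAKE_CODEC {
      @Override
      InputStream open(InputStream in) {
        return CodecStreams.createInputStream(in);
      }
    };

    abstract InputStream open(InputStream in);
  }

  public static void main(String[] args) {
    InputStream raw = new ByteArrayInputStream(new byte[] {1, 2, 3});
    Mode.UNCOMPRESSED.open(raw);
    System.out.println(CODEC_INITIALIZED.get()); // prints false
    Mode.FAKE_CODEC.open(raw);
    System.out.println(CODEC_INITIALIZED.get()); // prints true
  }
}
```

Users who never select the codec-backed mode never trigger the codec class, which is what makes it reasonable to ship the dependency as optional.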
 



Issue Time Tracking
---

Worklog Id: (was: 369328)
Time Spent: 6h 20m  (was: 6h 10m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369329=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369329
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364924751
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Compression.java
 ##
 @@ -152,6 +156,38 @@ public WritableByteChannel writeCompressed(WritableByteChannel channel) throws I
      }
    },
  
+  /**
+   * LZO compression using LZO Codec. .lzo_deflate extension is specified for the files which just
+   * use the LZO algorithm without headers.
+   */
+  LZO(".lzo_deflate", ".lzo_deflate") {
+    @Override
+    public ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
+      return Channels.newChannel(new LzoCompressorInputStream(Channels.newInputStream(channel)));
+    }
+
+    @Override
+    public WritableByteChannel writeCompressed(WritableByteChannel channel) throws IOException {
+      return Channels.newChannel(new LzoCompressorOutputStream(Channels.newOutputStream(channel)));
+    }
+  },
+
+  /**
+   * LZOP compression using LZOP Codec. .lzo extension is specified for the files with magic bytes
+   * and headers.
+   */
+  LZOP(".lzo", ".lzo") {
+    @Override
+    public ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
+      return Channels.newChannel(new LzopCompressorInputStream(Channels.newInputStream(channel)));
 
 Review comment:
   Why do you need LzoCompressorInputStream class at all?
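
To illustrate the point behind the question: a decompressing `InputStream` can be adapted to a channel directly with `Channels.newChannel`, without a dedicated wrapper class. The sketch below uses the JDK's `GZIPInputStream` purely as a stand-in so that it runs without extra dependencies; whether the LZO codec's own stream can be dropped in the same way is an assumption, and `DirectChannelWrapping` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class DirectChannelWrapping {

  /** Same shape as the enum method in the diff, with gzip standing in for LZO. */
  static ReadableByteChannel readDecompressed(ReadableByteChannel channel) throws IOException {
    return Channels.newChannel(new GZIPInputStream(Channels.newInputStream(channel)));
  }

  static byte[] gzip(byte[] data) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
      out.write(data);
    }
    return bytes.toByteArray();
  }

  /** Compresses the text, then reads it back through the channel-based path. */
  static String roundTrip(String text) throws IOException {
    byte[] compressed = gzip(text.getBytes(StandardCharsets.UTF_8));
    ReadableByteChannel raw = Channels.newChannel(new ByteArrayInputStream(compressed));
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    try (ReadableByteChannel decompressed = readDecompressed(raw)) {
      ByteBuffer buffer = ByteBuffer.allocate(64);
      while (decompressed.read(buffer) != -1) {
        buffer.flip();
        result.write(buffer.array(), 0, buffer.limit());
        buffer.clear();
      }
    }
    return new String(result.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    System.out.println(roundTrip("hello lzo"));
  }
}
```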
 



Issue Time Tracking
---

Worklog Id: (was: 369329)
Time Spent: 6.5h  (was: 6h 20m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 6.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369322=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369322
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364920736
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -58,6 +58,10 @@ test {
   }
 }
 
+configurations {
+testCompile.extendsFrom compileOnly
+}
+
 
 Review comment:
   ```suggestion
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 369322)
Time Spent: 5h 50m  (was: 5h 40m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=369325=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-369325
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 20:05
Start Date: 09/Jan/20 20:05
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r364921654
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -91,4 +95,6 @@ dependencies {
   shadowTest library.java.avro_tests
   shadowTest library.java.zstd_jni
   testRuntimeOnly library.java.slf4j_jdk14
+  compileOnly 'io.airlift:aircompressor:0.16' 
 
 Review comment:
   Please group the provided dependencies that have been added here with the 
ones above.
 

Issue Time Tracking
---

Worklog Id: (was: 369325)
Time Spent: 6h 10m  (was: 6h)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2020-01-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=368934=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-368934
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 09/Jan/20 10:44
Start Date: 09/Jan/20 10:44
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-572504197
 
 
   @lukecwik I've updated the PR based on the discussion that we had. Please 
let me know your thoughts and suggestions.
 

Issue Time Tracking
---

Worklog Id: (was: 368934)
Time Spent: 5h 20m  (was: 5h 10m)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=360598=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-360598
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 16/Dec/19 23:59
Start Date: 16/Dec/19 23:59
Worklog Time Spent: 10m 
  Work Description: gsteelman commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r358529085
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -761,6 +1043,132 @@ public void testGzipProgress() throws IOException {
 }
   }
 
+  @Test
+  public void testEmptyLzoProgress() throws IOException {
+    File tmpFile = tmpFolder.newFile("empty.lzo_deflate");
+    String filename = tmpFile.toPath().toString();
+    writeFile(tmpFile, new byte[0], CompressionMode.LZO);
+
+    PipelineOptions options = PipelineOptionsFactory.create();
+    CompressedSource source =
+        CompressedSource.from(new ByteSource(filename, 1)).withDecompression(CompressionMode.LZO);
+    try (BoundedReader readerOrig = source.createReader(options)) {
+      assertThat(readerOrig, instanceOf(CompressedReader.class));
+      CompressedReader reader = (CompressedReader) readerOrig;
+      // before starting
+      assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
 
 Review comment:
   I think we can add the constant for `CompressedSourceTest.java` at least. 
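
   A minimal sketch of what such a constant could look like. The name `DELTA`
   and the `assertClose` helper are hypothetical (the PR does not name them);
   the helper stands in for JUnit's `assertEquals(expected, actual, delta)` so
   that the example runs without a test framework:

```java
// Sketch only: DELTA and assertClose are hypothetical names, not taken from the PR.
final class ProgressToleranceSketch {
  /** Shared tolerance for comparing fraction-consumed values in tests. */
  static final double DELTA = 1e-6;

  /** Stand-in for JUnit's assertEquals(expected, actual, delta). */
  static void assertClose(double expected, double actual) {
    if (Math.abs(expected - actual) > DELTA) {
      throw new AssertionError("expected " + expected + " but was " + actual);
    }
  }

  public static void main(String[] args) {
    // Placeholder for what reader.getFractionConsumed() reports before reading starts.
    double fractionBeforeStart = 0.0;
    assertClose(0.0, fractionBeforeStart);
    System.out.println("tolerance = " + DELTA);
  }
}
```

   With a single named constant, each `1e-6` literal in the test class could be
   replaced by `DELTA`, so a later change of tolerance touches one line.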
 

Issue Time Tracking
---

Worklog Id: (was: 360598)
Time Spent: 5h 10m  (was: 5h)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=360597=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-360597
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 16/Dec/19 23:58
Start Date: 16/Dec/19 23:58
Worklog Time Spent: 10m 
  Work Description: gsteelman commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r358528923
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -235,6 +315,30 @@ public void testReadConcatenatedGzip() throws IOException {
     assertEquals(Bytes.asList(expected), actual);
   }
 
+  /**
+   * Using Lzo Codec Test a concatenation of lzo files is correctly decompressed.
+   *
+   * A concatenation of lzo files as one file is a valid lzo file and should decompress to be the
+   * concatenation of those individual files.
+   */
+  @Test
+  public void testReadConcatenatedLzo() throws IOException {
 
 Review comment:
   Perhaps it would be a good idea to add a test with an expected failure then? 
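
   A sketch of the two cases such tests would cover: a concatenation that
   decompresses to the joined payloads, and an input that is expected to fail.
   It uses gzip from the JDK as a stand-in, because the JDK ships no LZO codec;
   whether LZO files concatenate the same way is the claim made in the PR's
   Javadoc and is not verified here. The class and method names are
   hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch only: gzip stands in for LZO; all names here are hypothetical.
final class ConcatenatedStreamSketch {
  /** Compresses data into a single gzip member. */
  static byte[] gzip(byte[] data) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(data);
    }
    return out.toByteArray();
  }

  /** Decompresses every gzip member found in the input, in order. */
  static byte[] gunzip(byte[] compressed) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buf = new byte[4096];
      for (int n; (n = gz.read(buf)) != -1; ) {
        out.write(buf, 0, n);
      }
    }
    return out.toByteArray();
  }

  /** Joins two byte arrays, as concatenating two files on disk would. */
  static byte[] concat(byte[] a, byte[] b) {
    byte[] joined = Arrays.copyOf(a, a.length + b.length);
    System.arraycopy(b, 0, joined, a.length, b.length);
    return joined;
  }

  /** The expected-failure case: returns true when the input cannot be decompressed. */
  static boolean failsToDecompress(byte[] bytes) {
    try {
      gunzip(bytes);
      return false;
    } catch (IOException e) {
      return true;
    }
  }
}
```

   In a JUnit test the failure case would more likely be written with an
   expected exception than with a boolean helper; the helper only keeps the
   sketch free of test-framework dependencies.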
 

Issue Time Tracking
---

Worklog Id: (was: 360597)
Time Spent: 5h  (was: 4h 50m)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=360154=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-360154
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 16/Dec/19 10:08
Start Date: 16/Dec/19 10:08
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-565992871
 
 
   > @amoght I don't have enough context to make the call on that, as I am very 
new to Beam. I have reached out to some others at Twitter to also review this 
change, as they will have more context.
   
   Thanks Gary :) appreciate your help! 
 

Issue Time Tracking
---

Worklog Id: (was: 360154)
Time Spent: 4h 50m  (was: 4h 40m)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=359647=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-359647
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 13/Dec/19 20:56
Start Date: 13/Dec/19 20:56
Worklog Time Spent: 10m 
  Work Description: gsteelman commented on issue #10254: [BEAM-8564] Add 
LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-565605851
 
 
   @amoght I don't have enough context to make the call on that, as I am very 
new to Beam. I have reached out to some others at Twitter to also review this 
change, as they will have more context. 
 

Issue Time Tracking
---

Worklog Id: (was: 359647)
Time Spent: 4h 40m  (was: 4.5h)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=359607=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-359607
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 13/Dec/19 19:30
Start Date: 13/Dec/19 19:30
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-565577445
 
 
   @gsteelman we used the airlift/aircompressor library only for its 
compression and decompression mechanism. Its Input/Output stream 
implementation is what introduces the transitive dependency, which can be 
removed and replaced with the Apache Hadoop common library. This significantly 
reduces the size as well.
   So, here are the two possible options:
   1) We use only the compression and decompression mechanism from 
airlift/aircompressor and design the Input/Output streams for Beam ourselves. 
These streams will need to be updated if the corresponding classes change on 
airlift/aircompressor's end. But since we will only be using the compression 
and decompression mechanism, such updates should be small and rare, so this 
should not be a big issue.
   2) We introduce LZO as an optional package for Beam. This gives users the 
option to leave LZO out when Beam's size is a constraint or when LZO is not 
required.
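
   A rough sketch of what option 1 could look like: stream classes written
   against a small codec interface, with the compression library supplying only
   the block-level compress and decompress calls. Everything here is an
   assumption made for illustration: `BlockCodec`, `DeflateCodec`,
   `FramedOutputStream` and `FramedInputStream` are hypothetical names, the
   length-prefixed framing is invented and is not the LZO file format, and
   `java.util.zip` stands in for an LZO codec because the JDK ships none:

```java
import java.io.*;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical seam; a real LZO compressor/decompressor pair would sit behind it. */
interface BlockCodec {
  byte[] compress(byte[] block) throws IOException;
  byte[] decompress(byte[] packed) throws IOException;
}

/** Stand-in codec built on java.util.zip, since the JDK ships no LZO implementation. */
final class DeflateCodec implements BlockCodec {
  public byte[] compress(byte[] block) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DeflaterOutputStream z = new DeflaterOutputStream(out)) {
      z.write(block);
    }
    return out.toByteArray();
  }

  public byte[] decompress(byte[] packed) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (InflaterInputStream z = new InflaterInputStream(new ByteArrayInputStream(packed))) {
      byte[] buf = new byte[1024];
      for (int n; (n = z.read(buf)) != -1; ) {
        out.write(buf, 0, n);
      }
    }
    return out.toByteArray();
  }
}

/** Buffers bytes and writes [int packedLength][packed bytes] frames (invented framing). */
final class FramedOutputStream extends OutputStream {
  private final DataOutputStream out;
  private final BlockCodec codec;
  private final int blockSize;
  private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

  FramedOutputStream(OutputStream out, BlockCodec codec, int blockSize) {
    this.out = new DataOutputStream(out);
    this.codec = codec;
    this.blockSize = blockSize;
  }

  @Override
  public void write(int b) throws IOException {
    pending.write(b);
    if (pending.size() >= blockSize) {
      flushBlock();
    }
  }

  private void flushBlock() throws IOException {
    if (pending.size() == 0) {
      return;
    }
    byte[] packed = codec.compress(pending.toByteArray());
    out.writeInt(packed.length);
    out.write(packed);
    pending.reset();
  }

  @Override
  public void close() throws IOException {
    flushBlock(); // emit the final, possibly short, block
    out.close();
  }
}

/** Reads the frames back and serves the decompressed bytes. */
final class FramedInputStream extends InputStream {
  private final DataInputStream in;
  private final BlockCodec codec;
  private byte[] block = new byte[0];
  private int pos;

  FramedInputStream(InputStream in, BlockCodec codec) {
    this.in = new DataInputStream(in);
    this.codec = codec;
  }

  @Override
  public int read() throws IOException {
    while (pos == block.length) {
      byte[] packed;
      try {
        packed = new byte[in.readInt()];
      } catch (EOFException e) {
        return -1; // no further frame header: end of stream in this sketch
      }
      in.readFully(packed);
      block = codec.decompress(packed);
      pos = 0;
    }
    return block[pos++] & 0xFF;
  }
}
```

   Writing bytes through `new FramedOutputStream(sink, new DeflateCodec(), 8)`
   and reading the result back through `FramedInputStream` should return the
   original bytes. The point of the design is that only `BlockCodec` touches the
   compression library, so a change on the library's side would be confined to
   one adapter class.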
 

Issue Time Tracking
---

Worklog Id: (was: 359607)
Time Spent: 4.5h  (was: 4h 20m)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=359023=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-359023
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 12/Dec/19 23:48
Start Date: 12/Dec/19 23:48
Worklog Time Spent: 10m 
  Work Description: gsteelman commented on issue #10254: [BEAM-8564] Add 
LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-565238477
 
 
   > While studying the code, we found that the airlift/aircompressor library 
only requires some classes that are also present in the Apache Hadoop common 
package (~3.9 MB). Therefore, we are now thinking of making changes in the 
airlift/aircompressor package: replacing com.facebook.presto.hadoop with 
org.apache.hadoop.common and removing the other compression mechanisms present 
in the airlift/aircompressor package (such as zstd and gzip) while keeping only 
the required LZO package.
   > But if we go ahead with this approach, we will have to manually update 
this library whenever any changes are made to airlift/aircompressor's LZO 
package.
   > @lukecwik @gsteelman please provide your thoughts on this.
   
   Is it possible to instead add the dependencies on the `apache.hadoop.common` 
package directly in these changes, and not add a dependency on 
airlift/aircompressor in this change? I would prefer to stick with strict 
dependencies when possible, rather than relying on transitive dependencies to 
bring in the classes we need.
   
   Relying on the transitive dependencies brought in by airlift/aircompressor 
has its own set of issues, including having to update our libraries whenever 
changes are made to airlift.
 

Issue Time Tracking
---

Worklog Id: (was: 359023)
Time Spent: 4h 20m  (was: 4h 10m)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=358540=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-358540
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 12/Dec/19 10:28
Start Date: 12/Dec/19 10:28
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-564256222
 
 
   While studying the code, we found that the airlift/aircompressor library 
only requires some classes that are also present in the Apache Hadoop common 
package (~3.9 MB). Therefore, we are now thinking of making changes in the 
airlift/aircompressor package: replacing com.facebook.presto.hadoop with 
org.apache.hadoop.common and removing the other compression mechanisms present 
in the airlift/aircompressor package (such as zstd and gzip) while keeping only 
the required LZO package.
   But if we go ahead with this approach, we will have to manually update this 
library whenever any changes are made to airlift/aircompressor's LZO package.
   @lukecwik @gsteelman please provide your thoughts on this.
 

Issue Time Tracking
---

Worklog Id: (was: 358540)
Time Spent: 4h 10m  (was: 4h)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=357405=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-357405
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 10/Dec/19 21:01
Start Date: 10/Dec/19 21:01
Worklog Time Spent: 10m 
  Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO 
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-564256222
 
 
   While studying the code, we found that the airlift/aircompressor library 
only requires some classes that are also present in the Apache Hadoop common 
package (~3.9 MB). Therefore, we are now thinking of making changes in the 
airlift/aircompressor package: replacing com.facebook.presto.hadoop with 
org.apache.hadoop.common and removing the other compression mechanisms (such 
as zstd and gzip) while keeping only the required LZO package.
   But if we go ahead with this approach, we will have to manually update this 
library whenever any changes are made to airlift/aircompressor's LZO package.
   @lukecwik @gsteelman please provide your thoughts on this.
 

Issue Time Tracking
---

Worklog Id: (was: 357405)
Time Spent: 4h  (was: 3h 50m)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=354532=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-354532
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 05/Dec/19 18:03
Start Date: 05/Dec/19 18:03
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r354464601
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -161,6 +189,28 @@ public void testGzipSplittable() throws Exception {
 assertFalse(source.isSplittable());
   }
 
+  /** Test splittability of files in LZO mode -- none should be splittable. */
+  @Test
+  public void testLzoSplittable() throws Exception {
 
 Review comment:
   Thanks for pointing this out, this has been added.
 

Issue Time Tracking
---

Worklog Id: (was: 354532)
Time Spent: 3h 50m  (was: 3h 40m)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=354515=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-354515
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 05/Dec/19 17:46
Start Date: 05/Dec/19 17:46
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r354456382
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -761,6 +1043,132 @@ public void testGzipProgress() throws IOException {
 }
   }
 
+  @Test
+  public void testEmptyLzoProgress() throws IOException {
+    File tmpFile = tmpFolder.newFile("empty.lzo_deflate");
+    String filename = tmpFile.toPath().toString();
+    writeFile(tmpFile, new byte[0], CompressionMode.LZO);
+
+    PipelineOptions options = PipelineOptionsFactory.create();
+    CompressedSource source =
+        CompressedSource.from(new ByteSource(filename, 1)).withDecompression(CompressionMode.LZO);
+    try (BoundedReader readerOrig = source.createReader(options)) {
+      assertThat(readerOrig, instanceOf(CompressedReader.class));
+      CompressedReader reader = (CompressedReader) readerOrig;
+      // before starting
+      assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
 
 Review comment:
   It can be done, but that would require altering all the tests that use this 
constant value. Would that be fine?
 

Issue Time Tracking
---

Worklog Id: (was: 354515)
Time Spent: 3.5h  (was: 3h 20m)


[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=354516=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-354516
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 05/Dec/19 17:46
Start Date: 05/Dec/19 17:46
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r354451206
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -235,6 +315,30 @@ public void testReadConcatenatedGzip() throws IOException {
     assertEquals(Bytes.asList(expected), actual);
   }
 
+  /**
+   * Using Lzo Codec Test a concatenation of lzo files is correctly decompressed.
+   *
+   * A concatenation of lzo files as one file is a valid lzo file and should decompress to be the
+   * concatenation of those individual files.
 
 Review comment:
   This happens when we run the spotlessApply task. When the `<p>` tag is 
closed, spotlessCheck fails. I am not sure of the reason behind that.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 354516)
Time Spent: 3h 40m  (was: 3.5h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm which is focused on compression 
> and decompression speeds.
> This will enable Apache Beam sdk to compress/decompress files using LZO 
> compression algorithm. 
> This will include the following functionalities:
>  # compress() : for compressing files into an LZO archive
>  # decompress() : for decompressing files archived using LZO compression
> Appropriate Input and Output stream will also be added to enable working with 
> LZO files.





[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=354499=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-354499
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 05/Dec/19 17:40
Start Date: 05/Dec/19 17:40
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r354453765
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -267,6 +371,69 @@ public void testReadMultiStreamBzip2() throws IOException 
{
 verifyReadContents(output, tmpFile, mode);
   }
 
+  /**
+   * Test a lzo file containing multiple streams is correctly decompressed.
+   *
+   * A lzo file may contain multiple streams and should decompress as the 
concatenation of those
+   * streams.
 
 Review comment:
   This is happening due to spotlessApply.
 



Issue Time Tracking
---

Worklog Id: (was: 354499)
Time Spent: 3h 20m  (was: 3h 10m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm which is focused on compression 
> and decompression speeds.
> This will enable Apache Beam sdk to compress/decompress files using LZO 
> compression algorithm. 
> This will include the following functionalities:
>  # compress() : for compressing files into an LZO archive
>  # decompress() : for decompressing files archived using LZO compression
> Appropriate Input and Output stream will also be added to enable working with 
> LZO files.





[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=354498=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-354498
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 05/Dec/19 17:39
Start Date: 05/Dec/19 17:39
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r354453765
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -267,6 +371,69 @@ public void testReadMultiStreamBzip2() throws IOException 
{
 verifyReadContents(output, tmpFile, mode);
   }
 
+  /**
+   * Test a lzo file containing multiple streams is correctly decompressed.
+   *
+   * A lzo file may contain multiple streams and should decompress as the 
concatenation of those
+   * streams.
 
 Review comment:
   This is happening during spotlessApply.
 



Issue Time Tracking
---

Worklog Id: (was: 354498)
Time Spent: 3h 10m  (was: 3h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm which is focused on compression 
> and decompression speeds.
> This will enable Apache Beam sdk to compress/decompress files using LZO 
> compression algorithm. 
> This will include the following functionalities:
>  # compress() : for compressing files into an LZO archive
>  # decompress() : for decompressing files archived using LZO compression
> Appropriate Input and Output stream will also be added to enable working with 
> LZO files.





[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=354497=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-354497
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 05/Dec/19 17:38
Start Date: 05/Dec/19 17:38
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r354453135
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -235,6 +315,30 @@ public void testReadConcatenatedGzip() throws IOException 
{
 assertEquals(Bytes.asList(expected), actual);
   }
 
+  /**
+   * Using Lzo Codec Test a concatenation of lzo files is correctly 
decompressed.
+   *
+   * A concatenation of lzo files as one file is a valid lzo file and 
should decompress to be the
+   * concatenation of those individual files.
+   */
+  @Test
+  public void testReadConcatenatedLzo() throws IOException {
 
 Review comment:
   The LZOP codec currently returns only the contents of the first file when 
it is given concatenated files, because of the headers they contain. The test 
therefore fails, which is why we have not added it.
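
   For contrast, a minimal sketch of the behaviour the gzip variant of this 
test relies on, using only java.util.zip rather than the LZOP codec: two gzip 
members written back to back are read as a single stream. The class and helper 
names (ConcatenatedGzipDemo, gzip, gunzip, concat) are made up for illustration 
and are not part of the PR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ConcatenatedGzipDemo {

  /** Compresses one string into a complete gzip member (header, data, trailer). */
  public static byte[] gzip(String text) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
        out.write(text.getBytes(StandardCharsets.UTF_8));
      }
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Decompresses everything the gzip decoder yields from the given bytes. */
  public static String gunzip(byte[] compressed) {
    try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      ByteArrayOutputStream result = new ByteArrayOutputStream();
      byte[] buffer = new byte[256];
      int n;
      while ((n = in.read(buffer, 0, buffer.length)) != -1) {
        result.write(buffer, 0, n);
      }
      return new String(result.toByteArray(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Joins two byte arrays, mimicking `cat a.gz b.gz > both.gz`. */
  public static byte[] concat(byte[] first, byte[] second) {
    byte[] joined = new byte[first.length + second.length];
    System.arraycopy(first, 0, joined, 0, first.length);
    System.arraycopy(second, 0, joined, first.length, second.length);
    return joined;
  }

  public static void main(String[] args) {
    // Each member has its own header; the decoder is expected to read both.
    byte[] both = concat(gzip("first file\n"), gzip("second file\n"));
    System.out.print(gunzip(both));
  }
}
```

   A decoder that stops after the first member would return only the first 
line here, which is the behaviour described above for LZOP.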
 



Issue Time Tracking
---

Worklog Id: (was: 354497)
Time Spent: 3h  (was: 2h 50m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm which is focused on compression 
> and decompression speeds.
> This will enable Apache Beam sdk to compress/decompress files using LZO 
> compression algorithm. 
> This will include the following functionalities:
>  # compress() : for compressing files into an LZO archive
>  # decompress() : for decompressing files archived using LZO compression
> Appropriate Input and Output stream will also be added to enable working with 
> LZO files.





[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=354492=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-354492
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 05/Dec/19 17:34
Start Date: 05/Dec/19 17:34
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r354451206
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
 ##
 @@ -235,6 +315,30 @@ public void testReadConcatenatedGzip() throws IOException 
{
 assertEquals(Bytes.asList(expected), actual);
   }
 
+  /**
+   * Using Lzo Codec Test a concatenation of lzo files is correctly 
decompressed.
+   *
+   * A concatenation of lzo files as one file is a valid lzo file and 
should decompress to be the
+   * concatenation of those individual files.
 
 Review comment:
   This is happening when we run the spotlessApply task. When the  tag is 
closed, the spotlessCheck fails. Not sure of the reason behind that.
 



Issue Time Tracking
---

Worklog Id: (was: 354492)
Time Spent: 2h 50m  (was: 2h 40m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm which is focused on compression 
> and decompression speeds.
> This will enable Apache Beam sdk to compress/decompress files using LZO 
> compression algorithm. 
> This will include the following functionalities:
>  # compress() : for compressing files into an LZO archive
>  # decompress() : for decompressing files archived using LZO compression
> Appropriate Input and Output stream will also be added to enable working with 
> LZO files.





[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=354462=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-354462
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 05/Dec/19 16:55
Start Date: 05/Dec/19 16:55
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r354431147
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -90,4 +90,6 @@ dependencies {
   shadowTest library.java.avro_tests
   shadowTest library.java.zstd_jni
   testRuntimeOnly library.java.slf4j_jdk14
+  compile 'io.airlift:aircompressor:0.16'
+  compile 'com.facebook.presto.hadoop:hadoop-apache2:3.2.0-1'
 
 Review comment:
   This is included because the LzoCodec class, which is used to create the 
input streams, depends on classes from the org.apache.hadoop package, which 
is part of com.facebook.presto.hadoop.
   Since aircompressor is designed to also support optional Hadoop 
configurations, Hadoop comes into the picture (in our case, the Hadoop config 
is null).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 354462)
Time Spent: 2h 40m  (was: 2.5h)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm which is focused on compression 
> and decompression speeds.
> This will enable Apache Beam sdk to compress/decompress files using LZO 
> compression algorithm. 
> This will include the following functionalities:
>  # compress() : for compressing files into an LZO archive
>  # decompress() : for decompressing files archived using LZO compression
> Appropriate Input and Output stream will also be added to enable working with 
> LZO files.





[jira] [Work logged] (BEAM-8564) Add LZO compression and decompression support

2019-12-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=354447=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-354447
 ]

ASF GitHub Bot logged work on BEAM-8564:


Author: ASF GitHub Bot
Created on: 05/Dec/19 16:40
Start Date: 05/Dec/19 16:40
Worklog Time Spent: 10m 
  Work Description: amoght commented on pull request #10254: [BEAM-8564] 
Add LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#discussion_r354422677
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/util/LzoCompressorInputStream.java
 ##
 @@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.util;
+
+import io.airlift.compress.lzo.LzoCodec;
+import java.io.IOException;
+import java.io.InputStream;
+import org.apache.commons.compress.compressors.CompressorInputStream;
+import org.apache.commons.compress.utils.CountingInputStream;
+import org.apache.commons.compress.utils.IOUtils;
+import org.apache.commons.compress.utils.InputStreamStatistics;
+
+/**
+ * {@link CompressorInputStream} implementation to create LZO encoded stream. 
Library relies on <a href="https://github.com/airlift/aircompressor/">LZO</a>
+ *
+ * @since 1.18
+ */
+public class LzoCompressorInputStream extends CompressorInputStream
+implements InputStreamStatistics {
+
+  private final CountingInputStream countingStream;
+  private final InputStream lzoIS;
+
+  /**
+   * Wraps the given stream into a aircompressor's HadoopLzoInputStream using 
the LzoCodec.
+   *
+   * @param inStream the stream to write to
+   * @throws IOException if aircompressor does
+   */
+  public LzoCompressorInputStream(final InputStream inStream) throws 
IOException {
+this.lzoIS =
+new LzoCodec().createInputStream(countingStream = new 
CountingInputStream(inStream));
+  }
+
+  @Override
+  public int available() throws IOException {
+return lzoIS.available();
+  }
+
+  @Override
+  public void close() throws IOException {
+lzoIS.close();
+  }
+
+  @Override
+  public int read(final byte[] b) throws IOException {
+return read(b, 0, b.length);
+  }
+
+  @Override
+  public long skip(final long n) throws IOException {
+return IOUtils.skip(lzoIS, n);
+  }
+
+  @Override
+  public void mark(final int readlimit) {
+lzoIS.mark(readlimit);
+  }
+
+  @Override
+  public boolean markSupported() {
+return lzoIS.markSupported();
+  }
+
+  @Override
+  public int read() throws IOException {
+final int ret = lzoIS.read();
+count(ret == -1 ? 0 : 1);
+return ret;
+  }
+
+  @Override
+  public int read(final byte[] buf, final int off, final int len) throws 
IOException {
+if (len == 0) {
+  return 0;
+}
+final int ret = lzoIS.read(buf, off, len);
 
 Review comment:
   No, this case is already handled. The check is there simply so that, if the 
requested length is 0, the underlying read method is never invoked and the 
call returns right here. Basically, it avoids unnecessary method call 
overhead.
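
   A minimal sketch of that guard in isolation, assuming a hypothetical 
wrapper class ZeroLengthReadGuard built on java.io.FilterInputStream (it is 
not the class from the PR and does not use aircompressor). It counts how often 
the wrapped stream is actually asked for data, so the effect of the early 
return is visible. Returning 0 for a zero-length request also matches the 
java.io.InputStream contract.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class ZeroLengthReadGuard extends FilterInputStream {

  private long bytesRead; // bytes handed back to the caller so far
  private int delegateCalls; // how often the wrapped stream was asked for data

  public ZeroLengthReadGuard(InputStream in) {
    super(in);
  }

  @Override
  public int read() throws IOException {
    delegateCalls++;
    int ret = in.read();
    if (ret != -1) {
      bytesRead++;
    }
    return ret;
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    if (len == 0) {
      // Nothing was requested: answer here without touching the wrapped stream.
      return 0;
    }
    delegateCalls++;
    int ret = in.read(buf, off, len);
    if (ret > 0) {
      bytesRead += ret;
    }
    return ret;
  }

  public long getBytesRead() {
    return bytesRead;
  }

  public int getDelegateCalls() {
    return delegateCalls;
  }

  /** Returns {zero-length result, full read result, bytes read, delegate calls}. */
  public static long[] demo() {
    try (ZeroLengthReadGuard stream =
        new ZeroLengthReadGuard(new ByteArrayInputStream(new byte[] {1, 2, 3}))) {
      byte[] buf = new byte[8];
      long zeroLength = stream.read(buf, 0, 0); // short-circuited by the guard
      long fullRead = stream.read(buf, 0, buf.length); // forwarded to the delegate
      return new long[] {
        zeroLength, fullRead, stream.getBytesRead(), stream.getDelegateCalls()
      };
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(demo()));
  }
}
```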
 



Issue Time Tracking
---

Worklog Id: (was: 354447)
Time Spent: 2.5h  (was: 2h 20m)

> Add LZO compression and decompression support
> -
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Amogh Tiwari
>Assignee: Amogh Tiwari
>Priority: Minor
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm which is focused on compression 
> and decompression speeds.
> This will enable Apache Beam sdk to compress/decompress files using LZO 
> 
