[ https://issues.apache.org/jira/browse/SPARK-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354919#comment-16354919 ]
Sean Owen commented on SPARK-23347:
-----------------------------------

Yes, but that's the opposite of what this JIRA suggests is a problem. This is only a problem if write(byte) is called instead of the bulk write method. Where are you suggesting that happens?

> Introduce buffer between Java data stream and gzip stream
> ---------------------------------------------------------
>
>                 Key: SPARK-23347
>                 URL: https://issues.apache.org/jira/browse/SPARK-23347
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Ted Yu
>            Priority: Minor
>
> Currently GZIPOutputStream is used directly around ByteArrayOutputStream
> e.g. from KVStoreSerializer :
> {code}
> ByteArrayOutputStream bytes = new ByteArrayOutputStream();
> GZIPOutputStream out = new GZIPOutputStream(bytes);
> {code}
> This seems inefficient.
> GZIPOutputStream does not implement the write(byte) method. It only provides
> a write(byte[], offset, len) method, which calls the corresponding JNI zlib
> function.
> BufferedOutputStream can be introduced wrapping GZIPOutputStream for better
> performance.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
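
For reference, the wrapping proposed in the issue description can be sketched as below. This is only an illustration, not code from Spark or KVStoreSerializer: the class name BufferedGzipSketch and its methods are hypothetical. The writer deliberately calls write(int) once per byte, which is the only case where, per the comment above, the extra BufferedOutputStream should matter. A caller that already uses the bulk write(byte[], int, int) method would not be expected to benefit much.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class; not part of Spark.
public class BufferedGzipSketch {

    // Gzips data by writing it one byte at a time. When buffered is true,
    // a BufferedOutputStream sits in front of the GZIPOutputStream, so the
    // single-byte writes are collected and handed on in larger chunks.
    static byte[] gzipByteAtATime(byte[] data, boolean buffered) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            OutputStream gzip = new GZIPOutputStream(bytes);
            OutputStream target = buffered ? new BufferedOutputStream(gzip) : gzip;
            // Closing the outermost stream flushes it and finishes the gzip stream.
            try (OutputStream out = target) {
                for (byte b : data) {
                    out.write(b);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decompresses a gzip byte array so the round trip can be checked.
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                result.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = "some payload, some payload, some payload".getBytes(StandardCharsets.UTF_8);
        byte[] direct = gzipByteAtATime(data, false);
        byte[] viaBuffer = gzipByteAtATime(data, true);
        // Both variants should decompress back to the original input.
        System.out.println(Arrays.equals(gunzip(direct), data));
        System.out.println(Arrays.equals(gunzip(viaBuffer), data));
    }
}
```

Whether the buffered variant is actually faster depends on how the stream is written to, so it would need to be measured against the real call sites rather than assumed.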