[ 
https://issues.apache.org/jira/browse/HADOOP-19855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18071427#comment-18071427
 ] 

ASF GitHub Bot commented on HADOOP-19855:
-----------------------------------------

pan3793 commented on code in PR #8399:
URL: https://github.com/apache/hadoop/pull/8399#discussion_r3039411442


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/zstd/ZStandardDecompressor.java:
##########
@@ -262,35 +260,24 @@ private int populateUncompressedBuffer(byte[] b, int off, int len, int n) {
     return n;
   }
 
-  private native static void initIDs();
-  private native static long create();
-  private native static void init(long stream);
-  private native int inflateBytesDirect(ByteBuffer src, int srcOffset,
-      int srcLen, ByteBuffer dst, int dstOffset, int dstLen);
-  private native static void free(long strm);
-  private native static int getStreamSize();
-
-  int inflateDirect(ByteBuffer src, ByteBuffer dst) throws IOException {
-    assert
-        (this instanceof ZStandardDecompressor.ZStandardDirectDecompressor);
-
-    int originalPosition = dst.position();
-    int n = inflateBytesDirect(
-        src, src.position(), src.limit(), dst, dst.position(),
-        dst.limit()
-    );
-    dst.position(originalPosition + n);
-    if (bytesInCompressedBuffer > 0) {
-      src.position(compressedDirectBufOff);
+  int inflateDirect(ByteBuffer src, ByteBuffer dst) {
+    assert (this instanceof ZStandardDecompressor.ZStandardDirectDecompressor);
+
+    // zstd-jni: use streaming decompression directly on the provided buffers
+    int origDstPos = dst.position();
+    boolean done = zstdJniCtx.decompressDirectByteBufferStream(dst, src);
+    if (done) {
+      finished = true;
+      remaining = 0;
     } else {
-      src.position(src.limit());
+      remaining = src.limit() - src.position();
     }

Review Comment:
   Aligned the `finished`/`remaining` logic with `decompress()`: `finished` is 
now set only when `done && remaining == 0`.
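
A minimal sketch of the state update described in this comment, for readers who want to see the rule in isolation. It is not the PR's code: the `State` holder and the `update` helper are hypothetical names introduced for illustration, and `done` is assumed to mean that the streaming decoder reported a completed frame for this call.

```java
import java.nio.ByteBuffer;

public class FinishedRemainingSketch {

  // Hypothetical holder for the two fields named in the review comment.
  static final class State {
    boolean finished;
    int remaining;
  }

  // Updates the state after one streaming-decompress call.
  // `done` is assumed to be the decoder's "frame completed" result;
  // `src` is assumed to be positioned just past the consumed input.
  static void update(State state, boolean done, ByteBuffer src) {
    // Input bytes not yet consumed: src.limit() - src.position().
    state.remaining = src.remaining();
    // Finished only if the frame ended AND no input is left over.
    state.finished = done && state.remaining == 0;
  }

  public static void main(String[] args) {
    State state = new State();

    // Case 1: frame ended, but 3 bytes of input are still unread.
    ByteBuffer src = ByteBuffer.allocate(8);
    src.position(5);
    update(state, true, src);
    System.out.println("finished=" + state.finished
        + " remaining=" + state.remaining);

    // Case 2: frame ended and all input was consumed.
    src.position(src.limit());
    update(state, true, src);
    System.out.println("finished=" + state.finished
        + " remaining=" + state.remaining);
  }
}
```

Under this rule, leftover input after a completed frame keeps `finished` false, so a caller can still drain or inspect the remaining bytes.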





> Use zstd-jni in ZStandardCodec
> ------------------------------
>
>                 Key: HADOOP-19855
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19855
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: compress
>    Affects Versions: 3.5.0
>            Reporter: Cheng Pan
>            Priority: Major
>              Labels: pull-request-available
>
> In Hadoop, the zstd codec relies on native libraries, which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libzstd* to be installed on the system 
> {*}LD_LIBRARY_PATH{*}, and they have to be installed separately on each 
> cluster node, container image, and local test environment, which adds 
> significant deployment complexity. In some environments, the natives must be 
> compiled from source, which is non-trivial. This approach is also 
> platform-dependent: a binary built for one platform may not work on another, 
> so it requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, which results in higher application deployment and maintenance costs 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [zstd-jni|https://github.com/luben/zstd-jni], which is a JNI-based 
> implementation. It bundles native binaries for Linux, Mac, and IBM in the jar 
> file and can automatically load them into the JVM from the jar without any 
> setup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

