wgtmac commented on code in PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1116420531


##########
pom.xml:
##########
@@ -151,6 +151,9 @@
     <module>parquet-scala</module>
     <module>parquet-thrift</module>
     <module>parquet-hadoop-bundle</module>
+    <!--
+    <module>plugins/parquet-encoding-vector</module>

Review Comment:
   This means user has to remove the comment to enable it which seems not very 
friendly. We can use `<profile>` to enable module optionally.



##########
parquet-encoding/src/main/java/org/apache/parquet/column/values/bitpacking/Packer.java:
##########
@@ -86,6 +105,7 @@ private static Object getStaticField(String className, 
String fieldName) {
   static IntPackerFactory leIntPackerFactory = 
getIntPackerFactory("LemireBitPackingLE");
   static BytePackerFactory beBytePackerFactory = 
getBytePackerFactory("ByteBitPackingBE");
   static BytePackerFactory leBytePackerFactory = 
getBytePackerFactory("ByteBitPackingLE");
+  static BytePackerFactory leBytePacker512VectorFactory = null;

Review Comment:
   Please add a comment to explain leBytePacker512VectorFactory will be 
initialized lazily.



##########
plugins/parquet-encoding-vector/pom.xml:
##########
@@ -0,0 +1,131 @@
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+<project xmlns="http://maven.apache.org/POM/4.0.0";
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd";>
+  <parent>
+    <groupId>org.apache.parquet</groupId>
+    <artifactId>parquet</artifactId>
+    <version>1.13.0-SNAPSHOT</version>
+    <relativePath>../../pom.xml</relativePath>
+  </parent>
+
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>parquet-encoding-vector</artifactId>
+  <packaging>jar</packaging>
+
+  <name>Apache Parquet Encodings Vector</name>
+  <url>https://parquet.apache.org</url>
+
+  <properties>
+    <maven.compiler.source>17</maven.compiler.source>

Review Comment:
   If we add a new profile in the root pom file, then we can set compiler 
version there and avoid hardcode here.



##########
README.md:
##########
@@ -83,6 +83,16 @@ Parquet is a very active project, and new features are being 
added quickly. Here
 * Column stats
 * Delta encoding
 * Index pages
+* Java Vector API support
+
+## Java Vector API support
+Parquet-MR has supported Java Vector API to speed up reading, to enable the 
function:
+* Java 17+, 64-bit
+* For Intel CPU, Flags containing avx512vbmi and avx512_vbmi2 can have better 
performance gains(ICE Lake or newer processor).
+* mvn clean package -P java17-target -P vector

Review Comment:
   Please update the document to reflect the latest change.



##########
parquet-benchmarks/src/main/java/org/apache/parquet/benchmarks/ByteBitPackingVectorBenchmarks.java:
##########
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.benchmarks;
+
+import org.apache.parquet.column.values.bitpacking.BytePacker;
+import org.apache.parquet.column.values.bitpacking.Packer;
+import org.openjdk.jmh.annotations.Benchmark;
+import org.openjdk.jmh.annotations.BenchmarkMode;
+import org.openjdk.jmh.annotations.Level;
+import org.openjdk.jmh.annotations.Measurement;
+import org.openjdk.jmh.annotations.Mode;
+import org.openjdk.jmh.annotations.OutputTimeUnit;
+import org.openjdk.jmh.annotations.Scope;
+import org.openjdk.jmh.annotations.Setup;
+import org.openjdk.jmh.annotations.State;
+import org.openjdk.jmh.annotations.Warmup;
+
+import java.util.concurrent.TimeUnit;
+
+/**
+ * This class uses the java17 vector API, add VM options 
--add-modules=jdk.incubator.vector
+ */
+
+@State(Scope.Benchmark)
+@BenchmarkMode(Mode.AverageTime)
+@Warmup(iterations = 1, batchSize = 100000)
+@Measurement(iterations = 1, batchSize = 100000)
+@OutputTimeUnit(TimeUnit.MILLISECONDS)
+public class ByteBitPackingVectorBenchmarks {

Review Comment:
   If the parquet-encoding-vector does not build, what will happen to this 
file? Does it fail?



##########
plugins/parquet-encoding-vector/src/main/java/org/apache/parquet/column/values/bitpacking/ByteBitPacking512VectorLE.java:
##########
@@ -0,0 +1,3010 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.parquet.column.values.bitpacking;
+
+import jdk.incubator.vector.ByteVector;
+import jdk.incubator.vector.IntVector;
+import jdk.incubator.vector.LongVector;
+import jdk.incubator.vector.ShortVector;
+import jdk.incubator.vector.Vector;
+import jdk.incubator.vector.VectorMask;
+import jdk.incubator.vector.VectorOperators;
+import jdk.incubator.vector.VectorShuffle;
+import jdk.incubator.vector.VectorSpecies;
+
+import java.nio.ByteBuffer;
+
+/**
+ *

Review Comment:
   Remove it or add something?



##########
plugins/parquet-encoding-vector/src/main/java/org/apache/parquet/column/values/bitpacking/VectorSupport.java:
##########
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.column.values.bitpacking;
+
+/**
+ * Vector bits which has be supported

Review Comment:
   ```suggestion
    * Supported bit widths to use Vector API.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to