[GitHub] [orc] dongjoon-hyun commented on pull request #795: ORC-935: Bump commons-csv from 1.8 to 1.9.0

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #795:
URL: https://github.com/apache/orc/pull/795#issuecomment-896546565


   Merged to main for Apache ORC 1.8.0.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@orc.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [orc] dongjoon-hyun merged pull request #795: ORC-935: Bump commons-csv from 1.8 to 1.9.0

2021-08-10 Thread GitBox


dongjoon-hyun merged pull request #795:
URL: https://github.com/apache/orc/pull/795


   






[jira] [Created] (ORC-935) Bump commons-csv from 1.8 to 1.9.0

2021-08-10 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created ORC-935:
-

 Summary: Bump commons-csv from 1.8 to 1.9.0
 Key: ORC-935
 URL: https://issues.apache.org/jira/browse/ORC-935
 Project: ORC
  Issue Type: Sub-task
  Components: Java
Affects Versions: 1.8.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [orc] dongjoon-hyun commented on pull request #819: Bump netty-all from 4.1.42.Final to 4.1.66.Final in /java

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #819:
URL: https://github.com/apache/orc/pull/819#issuecomment-896531684


   @dependabot rebase






[GitHub] [orc] dongjoon-hyun edited a comment on pull request #848: ORC-934: Add integration tests for Java bench

2021-08-10 Thread GitBox


dongjoon-hyun edited a comment on pull request #848:
URL: https://github.com/apache/orc/pull/848#issuecomment-896530405


   Merged to main/1.7/1.6.






[GitHub] [orc] dongjoon-hyun commented on pull request #848: ORC-934: Add integration tests for Java bench

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #848:
URL: https://github.com/apache/orc/pull/848#issuecomment-896530405


   Merged to main/1.7.






[GitHub] [orc] dongjoon-hyun merged pull request #848: ORC-934: Add integration tests for Java bench

2021-08-10 Thread GitBox


dongjoon-hyun merged pull request #848:
URL: https://github.com/apache/orc/pull/848


   






[GitHub] [orc] dongjoon-hyun commented on pull request #848: ORC-934: Add integration tests for Java bench

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #848:
URL: https://github.com/apache/orc/pull/848#issuecomment-896524268


   The changes look feasible. Let's wait and see the CI results.






[GitHub] [orc] dongjoon-hyun commented on pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #847:
URL: https://github.com/apache/orc/pull/847#issuecomment-896510073


   You have been added to the Apache ORC contributor group, and I assigned ORC-933 to you, @krystalics.
   Welcome again!






[GitHub] [orc] dongjoon-hyun merged pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


dongjoon-hyun merged pull request #847:
URL: https://github.com/apache/orc/pull/847


   






[GitHub] [orc] dongjoon-hyun commented on a change in pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


dongjoon-hyun commented on a change in pull request #847:
URL: https://github.com/apache/orc/pull/847#discussion_r686497550



##
File path: java/examples/src/java/org/apache/orc/examples/AdvancedReader.java
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.orc.examples;
+
+import java.io.IOException;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.MapColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
+import org.apache.orc.OrcFile;
+import org.apache.orc.Reader;
+import org.apache.orc.RecordReader;
+import org.apache.orc.TypeDescription;
+
+/**
+ * This example shows how to read compound data types in ORC.
+ */
+public class AdvancedReader {
+
+  public static void main(Configuration conf, String[] args) throws IOException {
+// Get the information from the file footer
+Reader reader = OrcFile.createReader(new Path("advanced-example.orc"),
+OrcFile.readerOptions(conf));
+System.out.println("File schema: " + reader.getSchema());
+System.out.println("Row count: " + reader.getNumberOfRows());
+
+// Pick the schema we want to read using schema evolution
+TypeDescription readSchema =
+TypeDescription.fromString("struct>");
+// Read the row data
+VectorizedRowBatch batch = readSchema.createRowBatch();
+RecordReader rowIterator = reader.rows(reader.options()
+.schema(readSchema));
+LongColumnVector x = (LongColumnVector) batch.cols[0];
+LongColumnVector y = (LongColumnVector) batch.cols[1];
+MapColumnVector z = (MapColumnVector) batch.cols[2];
+
+/**
+ * Because the maximum batch size is 1024, the map values restart at
+ * row 1024 (zero-based, so the 1025th row).
+ * The last row is row 1499, whose map values run from 2375 to 2379.
+ */
+while (rowIterator.nextBatch(batch)) {
+  for (int row = 0; row < batch.size; ++row) {
+int xRow = x.isRepeating ? 0 : row;
+int yRow = y.isRepeating ? 0 : row;
+int zRow = z.isRepeating ? 0 : row;
+
+System.out.println("x: " +
+(x.noNulls || !x.isNull[xRow] ? x.vector[xRow] : null));
+System.out.println("y: " + (y.noNulls || !y.isNull[yRow] ? y.vector[yRow] : null));
+
+System.out.print("z: [");
+long index = z.offsets[zRow];
+for (long i = 0; i < z.lengths[zRow]; i++) {
+  final BytesColumnVector keys = (BytesColumnVector) z.keys;
+  final LongColumnVector values = (LongColumnVector) z.values;
+  String key = keys.toString((int) (index + i));
+  final long value = values.vector[(int) (index + i)];
+  System.out.print(key + ":" + value);
+  System.out.print(" ");
+}
+System.out.println("]");

Review comment:
   Although the output is a little confusing, the data looks correct to me.
   ```
   File schema: struct>
   Row count: 1500
   x: 0
   y: 0
   z: [row 0.0:0 row 0.1:1 row 0.2:2 row 0.3:3 row 0.4:4 ]
   x: 1
   y: 3
   z: [row 1.0:5 row 1.1:6 row 1.2:7 row 1.3:8 row 1.4:9 ]
   ```
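
The loop under review walks a map column through its `offsets` and `lengths` arrays. A minimal sketch of that layout follows, using plain Java arrays as a hypothetical stand-in for `MapColumnVector`; the class `MapLayoutSketch` and both method names are illustrative and not part of ORC or Hive. It also assumes, as the code comment in the diff suggests, that map values are numbered from the batch-local row, which is why they restart at row 1024.

```java
// Hypothetical stand-in for MapColumnVector: plain arrays, no Hive/ORC classes.
final class MapLayoutSketch {

  // Format the map stored at `row`, mirroring the print loop in the diff.
  // All keys and values live in flat arrays; offsets[row] is where this
  // row's entries start and lengths[row] is how many entries it has.
  static String formatMap(String[] keys, long[] values,
                          long[] offsets, long[] lengths, int row) {
    StringBuilder out = new StringBuilder("[");
    int start = (int) offsets[row];
    for (int i = 0; i < lengths[row]; i++) {
      out.append(keys[start + i]).append(':').append(values[start + i]).append(' ');
    }
    return out.append(']').toString();
  }

  // First map value of a file-level row, ASSUMING values are numbered from
  // the batch-local row (an assumption taken from the diff's code comment).
  static long firstMapValue(int fileRow, int batchSize, int entriesPerRow) {
    int batchRow = fileRow % batchSize;
    return (long) batchRow * entriesPerRow;
  }

  public static void main(String[] args) {
    int rows = 2;
    int entriesPerRow = 5;
    String[] keys = new String[rows * entriesPerRow];
    long[] values = new long[rows * entriesPerRow];
    long[] offsets = new long[rows];
    long[] lengths = new long[rows];
    for (int r = 0; r < rows; r++) {
      offsets[r] = (long) r * entriesPerRow;
      lengths[r] = entriesPerRow;
      for (int i = 0; i < entriesPerRow; i++) {
        int slot = r * entriesPerRow + i;
        keys[slot] = "row " + r + "." + i;
        values[slot] = slot;
      }
    }
    System.out.println("z: " + formatMap(keys, values, offsets, lengths, 1));
    System.out.println(firstMapValue(1499, 1024, entriesPerRow));
  }
}
```

With a batch size of 1024 and five entries per row, file row 1499 is batch row 475, so its first map value is 475 * 5 = 2375, which agrees with the comment in the diff.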








[GitHub] [orc] dongjoon-hyun commented on pull request #848: ORC-934: Add integration tests for Java bench

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #848:
URL: https://github.com/apache/orc/pull/848#issuecomment-896489481


   The error comes from here. Let me take a look at this.
   - https://github.com/airlift/aircompressor/blob/master/src/main/java/io/airlift/compress/snappy/UnsafeUtil.java#L52






[GitHub] [orc] krystalics commented on pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


krystalics commented on pull request #847:
URL: https://github.com/apache/orc/pull/847#issuecomment-896489312


   > Or, you can do it via a web browser at the following link. Please find and click `Fetch upstream`.
   > 
   > * https://github.com/krystalics/orc/tree/example-advanced-reader
   
   done






[GitHub] [orc] dongjoon-hyun commented on a change in pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


dongjoon-hyun commented on a change in pull request #847:
URL: https://github.com/apache/orc/pull/847#discussion_r686477128



##
File path: java/examples/src/java/org/apache/orc/examples/AdvancedReader.java
##
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.orc.examples;
+
+import java.io.IOException;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.MapColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
+import org.apache.orc.OrcFile;
+import org.apache.orc.Reader;
+import org.apache.orc.RecordReader;
+import org.apache.orc.TypeDescription;
+
+/**
+ * This example shows how to read compound data types in ORC.
+ *
+ */
+public class AdvancedReader {
+
+  public static void main(Configuration conf,String[] args) throws IOException {

Review comment:
   Thanks.








[GitHub] [orc] krystalics commented on a change in pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


krystalics commented on a change in pull request #847:
URL: https://github.com/apache/orc/pull/847#discussion_r686476960



##
File path: java/examples/src/java/org/apache/orc/examples/AdvancedReader.java
##
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.orc.examples;
+
+import java.io.IOException;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.MapColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
+import org.apache.orc.OrcFile;
+import org.apache.orc.Reader;
+import org.apache.orc.RecordReader;
+import org.apache.orc.TypeDescription;
+
+/**
+ * This example shows how to read compound data types in ORC.
+ *
+ */
+public class AdvancedReader {
+
+  public static void main(Configuration conf,String[] args) throws IOException {

Review comment:
   done
   
   








[GitHub] [orc] dongjoon-hyun commented on pull request #848: ORC-934: Add integration tests for Java bench

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #848:
URL: https://github.com/apache/orc/pull/848#issuecomment-896485275


   Oh, interesting. Your PR reveals that our benchmark fails on Java 16+. This is really invaluable test coverage!
   ```
   [WARN ] Problem opening checksum file: data/generated/sales/json.snappy.  Ignoring exception:
   java.io.EOFException
       at java.base/java.io.DataInputStream.readFully(DataInputStream.java:203)
       at java.base/java.io.DataInputStream.readFully(DataInputStream.java:172)
       at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:151)
       at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:346)
       at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
       at org.apache.orc.bench.core.convert.json.JsonReader.<init>(JsonReader.java:59)
       at org.apache.orc.bench.core.convert.GenerateVariants.createFileReader(GenerateVariants.java:90)
       at org.apache.orc.bench.core.convert.ScanVariants.run(ScanVariants.java:88)
       at org.apache.orc.bench.core.Driver.main(Driver.java:64)
   Error: Exception in thread "main" com.google.gson.JsonIOException: java.io.EOFException: End of input at line 1 column 1
       at com.google.gson.JsonStreamParser.hasNext(JsonStreamParser.java:109)
       at org.apache.orc.bench.core.convert.json.JsonReader.nextBatch(JsonReader.java:76)
       at org.apache.orc.bench.core.convert.ScanVariants.run(ScanVariants.java:92)
       at org.apache.orc.bench.core.Driver.main(Driver.java:64)
   Caused by: java.io.EOFException: End of input at line 1 column 1
       at com.google.gson.stream.JsonReader.nextNonWhitespace(JsonReader.java:1377)
       at com.google.gson.stream.JsonReader.consumeNonExecutePrefix(JsonReader.java:1514)
       at com.google.gson.stream.JsonReader.doPeek(JsonReader.java:523)
       at com.google.gson.stream.JsonReader.peek(JsonReader.java:414)
       at com.google.gson.JsonStreamParser.hasNext(JsonStreamParser.java:105)
       ... 3 more
   ```
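
Both exceptions in the trace are end-of-input errors. As a minimal sketch of the first one only, the JDK's `DataInputStream.readFully` throws `EOFException` when the stream ends before the requested bytes arrive. The class and method names below are hypothetical, and the sketch assumes an empty or truncated input; it does not reproduce the benchmark or explain why the failure appears on Java 16+.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical illustration; not code from ORC, Hadoop, or Gson.
final class EmptyStreamSketch {

  // Returns true if the stream ends before `wanted` bytes can be read.
  static boolean hitsEof(byte[] data, int wanted) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      in.readFully(new byte[wanted]);
      return false;
    } catch (EOFException e) {
      // readFully signals a short read with EOFException.
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("empty input, 8 bytes wanted: " + hitsEof(new byte[0], 8));
    System.out.println("8-byte input, 8 bytes wanted: " + hitsEof(new byte[8], 8));
  }
}
```

An empty input reports an end-of-file condition, while an input that holds all the requested bytes does not.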






[GitHub] [orc] krystalics commented on a change in pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


krystalics commented on a change in pull request #847:
URL: https://github.com/apache/orc/pull/847#discussion_r686476521



##
File path: java/examples/src/java/org/apache/orc/examples/AdvancedReader.java
##
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.orc.examples;
+
+import java.io.IOException;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.MapColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
+import org.apache.orc.OrcFile;
+import org.apache.orc.Reader;
+import org.apache.orc.RecordReader;
+import org.apache.orc.TypeDescription;
+
+/**
+ * This example shows how to read compound data types in ORC.
+ *
+ */
+public class AdvancedReader {
+
+  public static void main(Configuration conf,String[] args) throws IOException {

Review comment:
   My mistake; I forgot to format the code.








[GitHub] [orc] dongjoon-hyun edited a comment on pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


dongjoon-hyun edited a comment on pull request #847:
URL: https://github.com/apache/orc/pull/847#issuecomment-896484087


   Or, you can do it via a web browser at the following link. Please find and click `Fetch upstream`.
   - https://github.com/krystalics/orc/tree/example-advanced-reader






[GitHub] [orc] dongjoon-hyun commented on pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #847:
URL: https://github.com/apache/orc/pull/847#issuecomment-896484087


   Or, you can do it via a web browser at the following link.
   - https://github.com/krystalics/orc/tree/example-advanced-reader






[GitHub] [orc] dongjoon-hyun commented on pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #847:
URL: https://github.com/apache/orc/pull/847#issuecomment-896483712


   The branch is still far behind. Please rebase it and force-push it to remove `java/examples/pom.xml` from this PR.






[GitHub] [orc] krystalics commented on pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


krystalics commented on pull request #847:
URL: https://github.com/apache/orc/pull/847#issuecomment-896482903


   > In addition, the example is an executable uber jar.
   > 
   > ```
   > $ java -jar target/orc-examples-1.8.0-SNAPSHOT-uber.jar
   > ORC Java Examples
   > 
   > usage: java -jar orc-examples-*.jar [--help] [--define X=Y]  

   > 
   > Commands:
   >write - write a sample ORC file
   >read - read a sample ORC file
   >write2 - write a sample ORC file with a map
   > 
   > To get more help, provide -h to the command
   > ```
   > 
   > Please add a new command
   > 
   > * https://github.com/apache/orc/blob/main/java/examples/src/java/org/apache/orc/examples/Driver.java#L74-L76
   > * https://github.com/apache/orc/blob/main/java/examples/src/java/org/apache/orc/examples/Driver.java#L74-L76
   
   done, please check again






[GitHub] [orc] dongjoon-hyun commented on a change in pull request #847: ORC-933: Add `AdvancedReader.java` example

2021-08-10 Thread GitBox


dongjoon-hyun commented on a change in pull request #847:
URL: https://github.com/apache/orc/pull/847#discussion_r686475108



##
File path: java/examples/src/java/org/apache/orc/examples/AdvancedReader.java
##
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.orc.examples;
+
+import java.io.IOException;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.MapColumnVector;
+import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
+import org.apache.orc.OrcFile;
+import org.apache.orc.Reader;
+import org.apache.orc.RecordReader;
+import org.apache.orc.TypeDescription;
+
+/**
+ * This example shows how to read compound data types in ORC.
+ *
+ */
+public class AdvancedReader {
+
+  public static void main(Configuration conf,String[] args) throws IOException {

Review comment:
   nit. `Configuration conf,String[] args` -> `Configuration conf, String[] args`. We need a space between the two params.








[GitHub] [orc] williamhyun commented on pull request #848: ORC-934: Add integration tests for Java bench

2021-08-10 Thread GitBox


williamhyun commented on pull request #848:
URL: https://github.com/apache/orc/pull/848#issuecomment-896478425


   cc: @dongjoon-hyun 






[GitHub] [orc] williamhyun opened a new pull request #848: ORC-934: Add integration tests for Java bench

2021-08-10 Thread GitBox


williamhyun opened a new pull request #848:
URL: https://github.com/apache/orc/pull/848


   
   ### What changes were proposed in this pull request?
   This PR aims to add integration tests for Java bench. 
   
   
   ### Why are the changes needed?
   To prevent further regressions. 
   
   ### How was this patch tested?
   Pass the CIs. 
   






[jira] [Created] (ORC-934) Add integration tests for Java bench

2021-08-10 Thread William Hyun (Jira)
William Hyun created ORC-934:


 Summary: Add integration tests for Java bench
 Key: ORC-934
 URL: https://issues.apache.org/jira/browse/ORC-934
 Project: ORC
  Issue Type: Improvement
  Components: Java
Affects Versions: 1.8.0
Reporter: William Hyun
Assignee: William Hyun








[GitHub] [orc] krystalics commented on pull request #847: ORC-933:extend the example with advanced reader in orc-example

2021-08-10 Thread GitBox


krystalics commented on pull request #847:
URL: https://github.com/apache/orc/pull/847#issuecomment-896476663


   > In addition, the example is an executable uber jar.
   > 
   > ```
   > $ java -jar target/orc-examples-1.8.0-SNAPSHOT-uber.jar
   > ORC Java Examples
   > 
   > usage: java -jar orc-examples-*.jar [--help] [--define X=Y]  

   > 
   > Commands:
   >write - write a sample ORC file
   >read - read a sample ORC file
   >write2 - write a sample ORC file with a map
   > 
   > To get more help, provide -h to the command
   > ```
   > 
   > Please add a new command
   > 
   > * https://github.com/apache/orc/blob/main/java/examples/src/java/org/apache/orc/examples/Driver.java#L74-L76
   > * https://github.com/apache/orc/blob/main/java/examples/src/java/org/apache/orc/examples/Driver.java#L74-L76
   
   OK, I'll do it right now. Thanks a lot for the review.






[GitHub] [orc] krystalics commented on a change in pull request #847: ORC-933:extend the example with advanced reader in orc-example

2021-08-10 Thread GitBox


krystalics commented on a change in pull request #847:
URL: https://github.com/apache/orc/pull/847#discussion_r686468179



##
File path: java/examples/pom.xml
##
@@ -50,10 +50,12 @@
 
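     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-common</artifactId>
+      <scope>compile</scope>
     </dependency>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-hdfs</artifactId>
+      <scope>compile</scope>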

Review comment:
   I forked it yesterday; it changes so fast.








[GitHub] [orc] dongjoon-hyun commented on a change in pull request #847: ORC-933:extend the example with advanced reader in orc-example

2021-08-10 Thread GitBox


dongjoon-hyun commented on a change in pull request #847:
URL: https://github.com/apache/orc/pull/847#discussion_r686467716



##
File path: java/examples/pom.xml
##
@@ -50,10 +50,12 @@
 
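     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-common</artifactId>
+      <scope>compile</scope>
     </dependency>
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-hdfs</artifactId>
+      <scope>compile</scope>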

Review comment:
   BTW, @krystalics, when you make a PR, you had better rebase your branch onto the latest `main` branch.
   For example, this change is already in the existing `main` branch, but GitHub seems to show it as a diff because your PR branch is too old.








[GitHub] [orc] dongjoon-hyun commented on pull request #847: ORC-933:extend the example with advanced reader in orc-example

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #847:
URL: https://github.com/apache/orc/pull/847#issuecomment-896474540


   Welcome and no problem at all, @krystalics !






[GitHub] [orc] krystalics commented on a change in pull request #847: ORC-933:extend the example with advanced reader in orc-example

2021-08-10 Thread GitBox


krystalics commented on a change in pull request #847:
URL: https://github.com/apache/orc/pull/847#discussion_r686466938



##
File path: java/examples/src/java/org/apache/orc/examples/AdvancedReader.java
##
@@ -0,0 +1,71 @@
+package org.apache.orc.examples;

Review comment:
   got it








[GitHub] [orc] krystalics commented on pull request #847: ORC-933:extend the example with advanced reader in orc-example

2021-08-10 Thread GitBox


krystalics commented on pull request #847:
URL: https://github.com/apache/orc/pull/847#issuecomment-896474073


   > Thank you for making a PR, @krystalics .
   
   Thanks for replying so quickly. This is my first time contributing to this project.






[GitHub] [orc] dongjoon-hyun commented on a change in pull request #847: ORC-933:extend the example with advanced reader in orc-example

2021-08-10 Thread GitBox


dongjoon-hyun commented on a change in pull request #847:
URL: https://github.com/apache/orc/pull/847#discussion_r686466538



##
File path: java/examples/src/java/org/apache/orc/examples/AdvancedReader.java
##
@@ -0,0 +1,71 @@
+package org.apache.orc.examples;

Review comment:
   Apache projects always require the `Apache License` header. Please check the 
other example files and match their style.
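
For illustration, a new example file would typically begin with the standard ASF source header, placed before the `package` line. This is only a sketch: the class name `LicenseHeaderSketch` is hypothetical, and the exact comment style should follow the existing example files in the repository.

```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

// In the PR's file, the line `package org.apache.orc.examples;` would follow here.

// Hypothetical placeholder class; it only shows where the header sits.
final class LicenseHeaderSketch {
  public static void main(String[] args) {
    System.out.println("The license header precedes the package declaration.");
  }
}
```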








[GitHub] [orc] krystalics opened a new pull request #847: ORC-933:extend the example with advanced reader in orc-example

2021-08-10 Thread GitBox


krystalics opened a new pull request #847:
URL: https://github.com/apache/orc/pull/847


   
   ### What changes were proposed in this pull request?
   
   The main branch's example module has the AdvancedWriter but doesn't contain 
the matching reader, so I added it.
   
   






[jira] [Created] (ORC-933) extend the example with advanced reader

2021-08-10 Thread L-Job (Jira)
L-Job created ORC-933:
-

 Summary: extend the example with advanced reader
 Key: ORC-933
 URL: https://issues.apache.org/jira/browse/ORC-933
 Project: ORC
  Issue Type: Improvement
Reporter: L-Job


The main branch's example module has the AdvancedWriter but doesn't contain the 
matching reader, so I added it.

This is related to [ORC-902|https://issues.apache.org/jira/browse/ORC-902].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [orc] dongjoon-hyun commented on pull request #846: Bump jaxb-api from 2.2.11 to 2.3.1 in /java

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #846:
URL: https://github.com/apache/orc/pull/846#issuecomment-896413956


   @dependabot rebase






[GitHub] [orc] dependabot[bot] opened a new pull request #846: Bump jaxb-api from 2.2.11 to 2.3.1 in /java

2021-08-10 Thread GitBox


dependabot[bot] opened a new pull request #846:
URL: https://github.com/apache/orc/pull/846


   Bumps [jaxb-api](https://github.com/javaee/jaxb-spec) from 2.2.11 to 2.3.1.
   
   Commits
   
   See full diff in [compare view](https://github.com/javaee/jaxb-spec/commits/2.3.1)
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=javax.xml.bind:jaxb-api&package-manager=maven&previous-version=2.2.11&new-version=2.3.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   






[GitHub] [orc] dongjoon-hyun merged pull request #842: ORC-932: Bump byte-buddy from 1.10.19 to 1.11.12

2021-08-10 Thread GitBox


dongjoon-hyun merged pull request #842:
URL: https://github.com/apache/orc/pull/842


   






[jira] [Created] (ORC-932) Bump byte-buddy from 1.10.19 to 1.11.12

2021-08-10 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created ORC-932:
-

 Summary: Bump byte-buddy from 1.10.19 to 1.11.12
 Key: ORC-932
 URL: https://issues.apache.org/jira/browse/ORC-932
 Project: ORC
  Issue Type: Improvement
  Components: Java
Affects Versions: 1.8.0
Reporter: Dongjoon Hyun








[GitHub] [orc] dongjoon-hyun commented on pull request #823: ORC-886,ORC-905: Add integration tests for Java tools/examples

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #823:
URL: https://github.com/apache/orc/pull/823#issuecomment-896404207


   I landed this to branch-1.6, too.






[GitHub] [orc] dongjoon-hyun commented on pull request #843: ORC-929: Fix NaN error at orc-tools meta command

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #843:
URL: https://github.com/apache/orc/pull/843#issuecomment-896400879


   I landed this to branch-1.6, too.






[GitHub] [orc] dongjoon-hyun commented on pull request #839: ORC-926: Consolidate license header style in Java files

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #839:
URL: https://github.com/apache/orc/pull/839#issuecomment-896400756


   I landed this to branch-1.6 too.






[GitHub] [orc] dongjoon-hyun commented on pull request #833: ORC-921: Add an encrypted example file

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #833:
URL: https://github.com/apache/orc/pull/833#issuecomment-896400649


   I backported this to branch-1.6 too.






[GitHub] [orc] omalley closed pull request #716: ORC-743: Added conversion of SArg into filters to take advantage of t…

2021-08-10 Thread GitBox


omalley closed pull request #716:
URL: https://github.com/apache/orc/pull/716


   






[GitHub] [orc] dongjoon-hyun commented on pull request #795: Bump commons-csv from 1.8 to 1.9.0 in /java

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #795:
URL: https://github.com/apache/orc/pull/795#issuecomment-896363150


   @dependabot rebase






[GitHub] [orc] dongjoon-hyun commented on pull request #752: ORC-849: Core Benchmark Cleanup

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #752:
URL: https://github.com/apache/orc/pull/752#issuecomment-896301589


   I'll backport this to branch-1.7 for Apache ORC 1.7.0.






[GitHub] [orc] dongjoon-hyun merged pull request #844: ORC-930: Ignore unsupported JSON x ZSTD combination in bench

2021-08-10 Thread GitBox


dongjoon-hyun merged pull request #844:
URL: https://github.com/apache/orc/pull/844


   






[GitHub] [orc] dongjoon-hyun commented on pull request #844: ORC-930: Ignore unsupported JSON x ZSTD combination in bench

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #844:
URL: https://github.com/apache/orc/pull/844#issuecomment-896299804


   I'll merge this, @pgaref ~ :)






[GitHub] [orc] dongjoon-hyun merged pull request #845: ORC-931: Modify RunLengthIntegerWriterV2 code to improve readability

2021-08-10 Thread GitBox


dongjoon-hyun merged pull request #845:
URL: https://github.com/apache/orc/pull/845


   






[GitHub] [orc] dongjoon-hyun commented on pull request #844: ORC-930: Ignore unsupported JSON x ZSTD combination in bench

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #844:
URL: https://github.com/apache/orc/pull/844#issuecomment-896214316


   I added an empty commit to recover from the data loss during the GitHub outage.






[GitHub] [orc] dongjoon-hyun commented on pull request #844: ORC-930: Ignore unsupported JSON x ZSTD combination in bench

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #844:
URL: https://github.com/apache/orc/pull/844#issuecomment-896192907


   I updated my branch, but it seems that the GitHub outage reset the change from 
this PR.
   - https://github.com/dongjoon-hyun/orc/tree/ORC-930






[GitHub] [orc] dongjoon-hyun commented on pull request #844: ORC-930: Ignore unsupported JSON x ZSTD combination in bench

2021-08-10 Thread GitBox


dongjoon-hyun commented on pull request #844:
URL: https://github.com/apache/orc/pull/844#issuecomment-896121276


   Thank you for the review, @pgaref !
   For the `Scan`, it looks okay to ignore it because the file doesn't exist.
   For the `Generator`, I'll try to add a message.






[jira] [Created] (ORC-931) Optimize RunLengthIntegerWriterV2 code for better readability

2021-08-10 Thread Yiqun Zhang (Jira)
Yiqun Zhang created ORC-931:
---

 Summary: Optimize RunLengthIntegerWriterV2 code for better readability
 Key: ORC-931
 URL: https://issues.apache.org/jira/browse/ORC-931
 Project: ORC
  Issue Type: Improvement
Reporter: Yiqun Zhang


RunLengthIntegerWriterV2.java
Lines 512-546:
{code:java}
if (diffBitsLH > 1) {
  for (int i = 0; i < numLiterals; i++) {
    baseRedLiterals[i] = literals[i] - min;
  }
  brBits95p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 0.95);
  brBits100p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 1.0);
  if ((brBits100p - brBits95p) != 0 && Math.abs(min) < BASE_VALUE_LIMIT) {
    encoding = EncodingType.PATCHED_BASE;
    preparePatchedBlob();
    return;
  } else {
    encoding = EncodingType.DIRECT;
    return;
  }
} else {
  // if difference in bits between 95th percentile and 100th percentile is
  // 0, then patch length will become 0. Hence we will fallback to direct
  encoding = EncodingType.DIRECT;
  return;
}
{code}
All three branches have finished their logic at this point, so the return 
statements are redundant.
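
As a rough sketch of that point (not the actual patch), the same decision can be written without the trailing returns. `EncodingChoiceSketch` and `choose` are hypothetical names, the bit widths are passed in as parameters instead of being computed with `utils.percentileBits`, and the `BASE_VALUE_LIMIT` value is an assumed placeholder.

```java
// Hypothetical stand-alone sketch; not the real RunLengthIntegerWriterV2 code.
final class EncodingChoiceSketch {
  enum EncodingType { PATCHED_BASE, DIRECT }

  // Assumed placeholder value, used only for illustration.
  static final long BASE_VALUE_LIMIT = 1L << 56;

  // Mirrors the quoted branch structure; each branch only sets the encoding,
  // and control falls through to the single exit at the end.
  static EncodingType choose(int diffBitsLH, int brBits95p, int brBits100p, long min) {
    EncodingType encoding;
    if (diffBitsLH > 1) {
      if ((brBits100p - brBits95p) != 0 && Math.abs(min) < BASE_VALUE_LIMIT) {
        encoding = EncodingType.PATCHED_BASE;
        // The real writer would call preparePatchedBlob() here.
      } else {
        encoding = EncodingType.DIRECT;
      }
    } else {
      encoding = EncodingType.DIRECT;
    }
    return encoding;
  }

  public static void main(String[] args) {
    System.out.println(choose(5, 10, 14, 3L));
    System.out.println(choose(5, 10, 10, 3L));
    System.out.println(choose(1, 10, 14, 3L));
  }
}
```

Because nothing follows the if/else chain, removing the per-branch returns does not change which encoding is selected.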

Lines 691-704:
{code:java}
if (fixedRunLength < MIN_REPEAT) {
  variableRunLength = fixedRunLength;
  fixedRunLength = 0;
  determineEncoding();
  writeValues();
} else if (fixedRunLength >= MIN_REPEAT
    && fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
  encoding = EncodingType.SHORT_REPEAT;
  writeValues();
} else {
  encoding = EncodingType.DELTA;
  isFixedDelta = true;
  writeValues();
}
{code}
fixedRunLength >= MIN_REPEAT is redundant because the previous condition already 
ensures it. Move the writeValues() call to the end; it seems better for the 
conditional branches to deal only with encoding and state.
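
A minimal sketch of the proposed shape, assuming stand-in fields and methods: `RunEncodingSketch`, `flush`, the `OTHER` encoding, the `writes` counter, and the two constant values are all hypothetical and exist only to make the example self-contained.

```java
// Hypothetical stand-alone sketch; not the real RunLengthIntegerWriterV2 code.
final class RunEncodingSketch {
  enum EncodingType { SHORT_REPEAT, DELTA, OTHER }

  // Assumed values for illustration only.
  static final int MIN_REPEAT = 3;
  static final int MAX_SHORT_REPEAT_LENGTH = 10;

  EncodingType encoding;
  boolean isFixedDelta;
  int fixedRunLength;
  int variableRunLength;
  int writes; // counts writeValues() calls so the behavior can be observed

  // Stand-in for the real encoding analysis.
  void determineEncoding() { encoding = EncodingType.OTHER; }

  // Stand-in for the real output routine.
  void writeValues() { writes++; }

  // Branches only set encoding and state; writeValues() is called once at the end.
  void flushRun() {
    if (fixedRunLength < MIN_REPEAT) {
      variableRunLength = fixedRunLength;
      fixedRunLength = 0;
      determineEncoding();
    } else if (fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
      // fixedRunLength >= MIN_REPEAT is implied by the failed first test.
      encoding = EncodingType.SHORT_REPEAT;
    } else {
      encoding = EncodingType.DELTA;
      isFixedDelta = true;
    }
    writeValues();
  }

  // Convenience helper: build a writer with the given run length and flush it.
  static RunEncodingSketch flush(int fixedRunLength) {
    RunEncodingSketch w = new RunEncodingSketch();
    w.fixedRunLength = fixedRunLength;
    w.flushRun();
    return w;
  }

  public static void main(String[] args) {
    System.out.println(flush(2).encoding);
    System.out.println(flush(5).encoding);
    System.out.println(flush(11).encoding);
  }
}
```

Every path still calls `writeValues()` exactly once, so the extraction keeps the original behavior while leaving each branch responsible only for encoding and state.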

Lines 772-781:
{code:java}
if (fixedRunLength >= MIN_REPEAT) {
  if (fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
    encoding = EncodingType.SHORT_REPEAT;
    writeValues();
  } else {
    encoding = EncodingType.DELTA;
    isFixedDelta = true;
    writeValues();
  }
}
{code}
Ditto





[GitHub] [orc] guiyanakuang opened a new pull request #845: Modify RunLengthIntegerWriterV2 code to improve readability

2021-08-10 Thread GitBox


guiyanakuang opened a new pull request #845:
URL: https://github.com/apache/orc/pull/845


   
   
   ### What changes were proposed in this pull request?
   
   RunLengthIntegerWriterV2.java
   Lines 512-546:
   ```java
   if (diffBitsLH > 1) {
     for (int i = 0; i < numLiterals; i++) {
       baseRedLiterals[i] = literals[i] - min;
     }
     brBits95p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 0.95);
     brBits100p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 1.0);
     if ((brBits100p - brBits95p) != 0 && Math.abs(min) < BASE_VALUE_LIMIT) {
       encoding = EncodingType.PATCHED_BASE;
       preparePatchedBlob();
       return;
     } else {
       encoding = EncodingType.DIRECT;
       return;
     }
   } else {
     // if difference in bits between 95th percentile and 100th percentile is
     // 0, then patch length will become 0. Hence we will fallback to direct
     encoding = EncodingType.DIRECT;
     return;
   }
   ```
   All three branches have finished their logic at this point, so the return 
statements are redundant.
   
   Lines 691-704:
   ```java
   if (fixedRunLength < MIN_REPEAT) {
     variableRunLength = fixedRunLength;
     fixedRunLength = 0;
     determineEncoding();
     writeValues();
   } else if (fixedRunLength >= MIN_REPEAT
       && fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
     encoding = EncodingType.SHORT_REPEAT;
     writeValues();
   } else {
     encoding = EncodingType.DELTA;
     isFixedDelta = true;
     writeValues();
   }
   ```
   fixedRunLength >= MIN_REPEAT is redundant because the previous condition already 
ensures it.  
   Move the writeValues() call to the end; it seems better for the conditional 
branches to deal only with encoding and state.
   
   Lines 772-781:
   ```java
   if (fixedRunLength >= MIN_REPEAT) {
     if (fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
       encoding = EncodingType.SHORT_REPEAT;
       writeValues();
     } else {
       encoding = EncodingType.DELTA;
       isFixedDelta = true;
       writeValues();
     }
   }
   ```
   Ditto
   
   ### Why are the changes needed?
   
   Modify code to improve readability
   
   ### How was this patch tested?
   
   Pass the CIs






Re: [RESULT][VOTE] Should we release ORC 1.6.10rc0?

2021-08-10 Thread Panos Garefalakis
Thank you all for moving this forward!
Late +1 from my side -- verified GPG and checksum, built and ran tests for
Java and C++

Cheers,
Panagiotis

On Tue, Aug 10, 2021 at 6:01 AM Owen O'Malley 
wrote:

> With five +1's (three binding) and no -1's the vote passes. I'll publish
> the release.
>
> Thank you, Dongjoon, Kyle, William, and Gang for voting.
>
> .. Owen
>
> On Mon, Aug 9, 2021 at 7:41 PM Gang Wu  wrote:
>
> > +1.
> >
> > - Verified checksum and GPG.
> > - Built and ran unit tests for both Java and C++.
> >
> > Best,
> > Gang
> >
> > On Tue, Aug 10, 2021 at 1:31 AM Owen O'Malley 
> > wrote:
> >
> > > Thanks for the votes, Dongjoon, Kyle, and William!
> > >
> > > We need one more PMC vote.
> > >
> > > Thanks,
> > >Owen
> > >
> > > On Sat, Aug 7, 2021 at 5:12 PM William Hyun wrote:
> > >
> > > > +1
> > > >
> > > > I tested locally, looks good to me.
> > > >
> > > > Thank you All,
> > > > William
> > > >
> > > > On 2021/08/05 18:11:07, Kyle Bendickson wrote:
> > > > > +1 for the release.
> > > > >
> > > > > I have tested ORC 1.6.10 with Iceberg (against the current master branch):
> > > > > https://github.com/kbendick/iceberg/pull/43
> > > > > It passes without issue and without any required changes.
> > > > >
> > > > > I have also checked the SHA checksum and the GPG signature against KEYS,
> > > > > and built and ran from source as well.
> > > > >
> > > > > Thank you Owen,
> > > > > Kyle
> > > > > 
> > > > >
> > > > > Kyle Bendickson
> > > > > Software Engineer
> > > > > Apple
> > > > > ACS Data
> > > > > One Apple Park Way,
> > > > > Cupertino, CA 95014, USA
> > > > > kbendick...@apple.com
> > > > >
> > > > >
> > > > >
> > > > > > On Aug 4, 2021, at 10:15 PM, Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
> > > > > >
> > > > > > +1 for the release.
> > > > > >
> > > > > > Thank you, Owen.
> > > > > >
> > > > > > 1. Shasum and gpg sig checked.
> > > > > > 2. Built and ran the C++/Java unit tests from the source
> > > > > >on Apple Silicon (M1) with OpenJDK 1.8.0_302.
> > > > > >(cmake -DANALYZE_JAVA=ON ..; make package test-out)
> > > > > > 3. Tested Java tools uber jar manually.
> > > > > > 4. Tested Java example uber jar manually.
> > > > > > 5. Docker tests passed. (on the branch-1.6)
> > > > > > 6. Apache Spark integration test passed (ORC snapshot + Spark
> > > > > > 3.3.0-SNAPSHOT)
> > > > > >https://github.com/dongjoon-hyun/spark/pull/63
> > > > > >
> > > > > > I only noticed that the commit message has a typo, `Preparing for
> > > > > > release 1.5.10.`.
> > > > > >
> > > > > >https://github.com/apache/orc/releases/tag/release-1.6.10rc0
> > > > > >
> > > > > > I don't think that's a blocker.
> > > > > >
> > > > > > Thanks,
> > > > > > Dongjoon.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Aug 4, 2021 at 6:36 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> Thank you, Owen.
> > > > > >> I started the testing.
> > > > > >>
> > > > > >> Dongjoon.
> > > > > >>
> > > > > >> On Wed, Aug 4, 2021 at 11:44 AM Owen O'Malley <owen.omal...@gmail.com>
> > > > > >> wrote:
> > > > > >>> All,
> > > > > >>>
> > > > > >>> Should we release the following artifacts as ORC 1.6.10?
> > > > > >>>
> > > > > >>> tar: http://home.apache.org/~omalley/orc-1.6.10/
> > > > > >>> tag: https://github.com/apache/orc/releases/tag/release-1.6.10rc0
> > > > > >>> jiras: https://issues.apache.org/jira/projects/ORC/versions/12350446
> > > > > >>>
> > > > > >>> Thanks!
> > > > > >>>
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>