yesh385 opened a new pull request, #4813:
URL: https://github.com/apache/hive/pull/4813

   This PR fixes a flaky test called `TestLazyBinaryColumnarSerDe.testSerDe` 
which can be found 
[here](https://github.com/apache/hive/blob/master/serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java).
   
   1. How was this test identified as flaky?
   
   This test was identified as flaky using an open-source research tool 
named [NonDex](https://github.com/TestingResearchIllinois/NonDex), which 
finds and diagnoses non-deterministic runtime exceptions in Java programs.
   
   2. What does this test do? 
   
   This test exercises serialization and deserialization of data with a 
specific implementation called `LazyBinaryColumnarSerDe`. It verifies that an 
object of type `OuterStruct` is serialized and deserialized correctly. 
   
   3. Why is this test flaky?
   
   This test is flaky as there is an order mismatch between the different 
fields of the object inspector `oi` in the serialization process.
   
   The error occurs here:
   
https://github.com/apache/hive/blob/ed98e1cd01c937e202666bb5e7fbd00e45088164/serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java#L104
   
   Specifically, the field `f` and the field object inspector `foi` are 
mismatched during serialization, so the field is serialized with the wrong 
field object inspector, which results in a `java.lang.ClassCastException`.
   
https://github.com/apache/hive/blob/ed98e1cd01c937e202666bb5e7fbd00e45088164/serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java#L119
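
To illustrate the kind of failure described above, here is a minimal sketch that is not taken from Hive. The class `OrderMismatchSketch` and its two "inspectors" are hypothetical stand-ins: each one casts the value to the type it expects, and values are paired with inspectors by position. When the two lists are in the same order serialization succeeds; when one of them is reordered, the cast fails with a `ClassCastException`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class OrderMismatchSketch {

    // Hypothetical inspectors (not Hive classes): each casts to the type it expects.
    static final Function<Object, String> INT_INSPECTOR = v -> "int:" + (Integer) v;
    static final Function<Object, String> STRING_INSPECTOR = v -> "string:" + (String) v;

    // Pairs the i-th value with the i-th inspector, so both lists must share one order.
    static String serialize(List<Object> values, List<Function<Object, String>> inspectors) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            out.append(inspectors.get(i).apply(values.get(i))).append(';');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Object> values = Arrays.<Object>asList(1, "a");

        // Matching order: every value meets the inspector for its type.
        System.out.println(serialize(values, Arrays.asList(INT_INSPECTOR, STRING_INSPECTOR)));

        // Reordered inspectors: the Integer reaches the String inspector and the cast fails.
        try {
            serialize(values, Arrays.asList(STRING_INSPECTOR, INT_INSPECTOR));
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: order mismatch");
        }
    }
}
```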
   
   4. How did I fix this test?
   
   This PR fixes this error by sorting the fields of the object inspector `oi` 
based on the `slot` property.
   
   In the fix, we use `Arrays.sort` to sort the array `f` with a custom 
comparator. The comparator uses reflection to get a reference to the private 
field named `slot` in the class `Field`, makes it accessible, and reads its 
value for the current object as an integer. If the reflection fails or the 
`slot` field does not exist, we catch the exception and print the stack trace. 
In that case we return 0 as a fallback value so that the program can continue 
executing.
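
The approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the class `SlotOrderSketch`, the helper names `slotOf` and `sortBySlot`, and the `Sample` class (a stand-in for `OuterStruct`) are all hypothetical. Unlike the PR, the sketch does not print a stack trace on failure. `slot` is a JDK-internal detail of `java.lang.reflect.Field`, and newer JDKs may block this reflective access, in which case every field falls back to 0. Because `Arrays.sort` on objects is stable, the original order is then left unchanged.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;

class SlotOrderSketch {

    // Hypothetical stand-in for the test's OuterStruct; not Hive code.
    static class Sample {
        int first;
        String second;
        double third;
    }

    // Reads the JDK-internal `slot` value of a reflected Field.
    // Returns 0 as a fallback when the value cannot be read.
    static int slotOf(Field field) {
        try {
            Field slot = Field.class.getDeclaredField("slot");
            slot.setAccessible(true);
            return slot.getInt(field);
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Missing field or blocked access: fall back so sorting can continue.
            return 0;
        }
    }

    // Returns a copy of the array ordered by ascending slot value.
    static Field[] sortBySlot(Field[] fields) {
        Field[] sorted = fields.clone();
        Arrays.sort(sorted, Comparator.comparingInt(SlotOrderSketch::slotOf));
        return sorted;
    }

    public static void main(String[] args) {
        for (Field f : sortBySlot(Sample.class.getDeclaredFields())) {
            System.out.println(slotOf(f) + " " + f.getName());
        }
    }
}
```

The fallback to 0 keeps the comparator from throwing, but it also means the sort gives no ordering guarantee whenever `slot` cannot be read.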
   
   You can run the following command to run the test using the NonDex tool:
   ```
   mvn edu.illinois:nondex-maven-plugin:2.1.1:nondex -pl serde -Dtest=org.apache.hadoop.hive.serde2.columnar.TestLazyBinaryColumnarSerDe#testSerDe
   ```
   
   (Optional) You can also run the following command to run the test:
   ```
   mvn -pl serde test -Dtest=org.apache.hadoop.hive.serde2.columnar.TestLazyBinaryColumnarSerDe#testSerDe
   ```
   
   Test Environment:
   ```
   java version "1.8.0_202"
   Apache Maven 3.6.3
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

