hudi-bot opened a new issue, #17254:
URL: https://github.com/apache/hudi/issues/17254

   ORC tests fail on Spark 3.5 in Azure CI due to setup issues.  These tests 
are temporarily disabled.  We should fix them.
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-8081
   - Type: Sub-task
   - Parent: https://issues.apache.org/jira/browse/HUDI-9113
   - Fix version(s):
     - 1.1.0
   
   
   ---
   
   
   ## Comments
   
   07/Oct/24 23:03;yihua;Error stacktrace
   {code:java}
   java.io.IOException: Problem adding row to file:/var/folders/60/wk8qzx310fd32b2dp7mhzvdc0000gn/T/junit1796010661885682116/orcFiles/1.orc
       at org.apache.orc.impl.WriterImpl.addRowBatch(WriterImpl.java:761)
       at org.apache.hudi.utilities.testutils.UtilitiesTestBase$Helpers.saveORCToDFS(UtilitiesTestBase.java:446)
       at org.apache.hudi.utilities.testutils.UtilitiesTestBase$Helpers.saveORCToDFS(UtilitiesTestBase.java:434)
       at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.prepareORCDFSFiles(HoodieDeltaStreamerTestBase.java:444)
       at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.prepareORCDFSFiles(HoodieDeltaStreamerTestBase.java:432)
       at org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testORCDFSSource(TestHoodieDeltaStreamer.java:1799)
       at org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testORCDFSSourceWithoutSchemaProviderAndNoTransformer(TestHoodieDeltaStreamer.java:2220)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
   Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.isSelectedInUse()Z
       at org.apache.orc.impl.WriterImpl.addRowBatch(WriterImpl.java:710)
   {code}
    
   
    ;;;
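
A `NoSuchMethodError` like the one above means the `VectorizedRowBatch` class loaded at runtime does not have the `isSelectedInUse()` method that the ORC writer was compiled against. One way to confirm this from the test classpath is a small reflection probe. The sketch below is illustrative only: `MethodProbe`, `describe`, and `hasMethod` are hypothetical names, not part of Hudi, ORC, or Hive, and the thread does not say which jar supplies the class.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Hudi, ORC, or Hive.
final class MethodProbe {

    /** Returns true if the class exposes a public no-argument method with the given name. */
    static boolean hasMethod(Class<?> cls, String methodName) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(methodName) && m.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }

    /** Describes where a class was loaded from and whether the method is present. */
    static String describe(String className, String methodName) {
        try {
            // Load without running static initializers; we only need metadata.
            Class<?> cls = Class.forName(className, false, MethodProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            String location = (src == null || src.getLocation() == null)
                    ? "<bootstrap or unknown>"
                    : src.getLocation().toString();
            String status = hasMethod(cls, methodName) ? "present" : "MISSING";
            return className + " from " + location + ": " + methodName + "() " + status;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Class and method names are taken from the stack trace above.
        System.out.println(describe(
                "org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch",
                "isSelectedInUse"));
    }
}
```

Running `main` with the same classpath as the failing test should print the jar that wins class resolution, which is the one to compare against the version ORC expects.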
   
   ---
   
   10/Oct/24 17:46;linliu;After upgrading to >= 3.x, we saw the following error:
   
   
   {code:java}
   java.lang.ClassCastException: org.apache.hadoop.hive.thrift.TUGIContainingTransport cannot be cast to org.apache.hadoop.hive.metastore.security.TUGIContainingTransport
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.setIpAddress(TUGIBasedProcessor.java:177) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:74) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [libthrift-0.9.3.jar:0.9.3]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_392-internal]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_392-internal]
        at java.lang.Thread.run(Thread.java:750) [?:1.8.0_392-internal]
   {code}
   ;;;
   
   ---
   
   10/Oct/24 17:49;linliu;The reason is that, for >= 3.x, the following two classes are in the same jar:
   
   
   {code:java}
   org.apache.hadoop.hive.thrift.TUGIContainingTransport,
   org.apache.hadoop.hive.metastore.security.TUGIContainingTransport
   {code}
   
   There is no easy way to exclude either of the two classes, and we don't know whether their methods differ. Therefore, I decided to postpone this ticket and revisit it after we finish higher-priority tasks.
   ;;;
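
To see which jars on a given classpath actually contain each of the two `TUGIContainingTransport` classes, every copy of the corresponding `.class` resource can be listed. This is a minimal sketch under assumptions: `ClassLocator`, `resourceName`, and `locate` are hypothetical names introduced here, and the output depends entirely on the classpath it is run with.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; not part of Hudi or Hive.
final class ClassLocator {

    /** Converts a fully qualified class name to its classpath resource name. */
    static String resourceName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Lists every classpath location that contains the given class. */
    static List<URL> locate(String className, ClassLoader loader) {
        try {
            return Collections.list(loader.getResources(resourceName(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Class names are taken from the comment above.
        String[] names = {
            "org.apache.hadoop.hive.thrift.TUGIContainingTransport",
            "org.apache.hadoop.hive.metastore.security.TUGIContainingTransport"
        };
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        for (String name : names) {
            List<URL> hits = locate(name, loader);
            System.out.println(name + ": " + (hits.isEmpty() ? "not found" : hits.toString()));
        }
    }
}
```

If both names resolve to the same jar, that would match the observation above; more than one URL for a single name would indicate duplicate copies across jars.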
   
   ---
   
   10/Oct/24 18:29;yihua;Thanks for the findings.
   
   Here are the tests that are disabled due to the ORC and Hive dependency issues:
    * testORCDFSSourceWithoutSchemaProviderAndNoTransformer, testORCDFSSourceWithSchemaProviderAndWithTransformer
    * TestHoodieSnapshotExporter with ORC
   
   As long as we validate that the Hudi streamer can read an ORC DFS source using spark-submit and that HoodieSnapshotExporter can export the ORC format, we are good; i.e., this is a test setup issue that we can tackle later.;;;
   
   ---
   
   13/Oct/24 23:01;yihua;I'm deferring this task to Hudi 1.1.;;;


