[ https://issues.apache.org/jira/browse/HUDI-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sivabalan narayanan updated HUDI-2493: -------------------------------------- Labels: sev:critical (was: ) > Verify removing glob pattern works w/ all key generators > -------------------------------------------------------- > > Key: HUDI-2493 > URL: https://issues.apache.org/jira/browse/HUDI-2493 > Project: Apache Hudi > Issue Type: Improvement > Components: Spark Integration > Reporter: sivabalan narayanan > Assignee: Raymond Xu > Priority: Major > Labels: sev:critical > > In the last release we added support to remove glob pattern. i.e. > while reading hudi dataset, > {code:java} > spark.read.format("hudi").load(basePath+"/*/*") > {code} > -> > {code:java} > spark.read.format("hudi").load(basePath) > {code} > Suffixing with > {code:java} > "/*/*" > {code} > is not required anymore. > But we need to verify if the same works for all key generators before we can > announce that in general this can be used. Or else we have to call out for > what key gens this works. and put in a fix for those which does not work. > > For eg: > I tried removing glob pattern from few key generator tests in > TestCOWDataSource and it failed. > [https://github.com/apache/hudi/blob/36be28712196ff4427c41b0aa885c7fcd7356d7f/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala#L413] > > ones which worked after removing glob pattern: > SimpleKeyGenerator, ComplexKeyGenerator, GlobalDeleteKeyGenerator, > NonpartitionedKeyGenerator > Ones which did not work: > CustomKeyGenerator, TimestampBasedKeyGenerator > > You can try it locally by removing the glob pattern for these tests. > stacktrace for timestamp based key gen > {code:java} > 0 [main] WARN org.apache.spark.util.Utils - Your hostname, > Sivabalans-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using > 10.0.0.202 instead (on interface en0)0 [main] WARN > org.apache.spark.util.Utils - Your hostname, Sivabalans-MacBook-Pro.local > resolves to a loopback address: 127.0.0.1; using 10.0.0.202 instead (on > interface en0)1 [main] WARN org.apache.spark.util.Utils - Set > SPARK_LOCAL_IP if you need to bind to another address390 [main] WARN > org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop > library for your platform... using builtin-java classes where applicable11317 > [main] WARN org.apache.hudi.metadata.HoodieBackedTableMetadata - Metadata > table was not found at path > file:/var/folders/ym/8yjkm3n90kq8tk4gfmvk7y140000gn/T/junit1114662825639168138/dataset/.hoodie/metadata11515 > [main] WARN org.apache.hudi.metadata.HoodieBackedTableMetadata - Metadata > table was not found at path > file:/var/folders/ym/8yjkm3n90kq8tk4gfmvk7y140000gn/T/junit1114662825639168138/dataset/.hoodie/metadata11840 > [main] WARN org.apache.spark.util.Utils - Truncated the string > representation of a plan since it was too large. This behavior can be > adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.12319 > [main] WARN org.apache.hudi.testutils.HoodieClientTestHarness - Closing > file-system instance used in previous test-run > org.opentest4j.AssertionFailedError: Expected :trueActual :false<Click to > see difference> at > org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at > org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at > org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at > org.apache.hudi.functional.TestCOWDataSource.testSparkPartitonByWithTimestampBasedKeyGenerator(TestCOWDataSource.scala:517) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:688) > at > org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60) > at > org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131) > at > org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149) > at > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140) > at > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84) > at > org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115) > at > org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105) > at > org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106) > at > org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64) > at > org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45) > at > org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37) > at > org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104) > at > org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98) > at > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$6(TestMethodTestDescriptor.java:212) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:208) > at > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:137) > at > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:71) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:139) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129) > at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) > at java.util.ArrayList.forEach(ArrayList.java:1257) at > org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:143) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129) > at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) > at java.util.ArrayList.forEach(ArrayList.java:1257) at > org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:143) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129) > at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) > at > org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:32) > at > org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57) > at > org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:51) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:87) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:53) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:66) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:51) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:87) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:66) > at > com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:71) > at > com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) > at > com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:221) > at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54) > {code} > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)