[ 
https://issues.apache.org/jira/browse/HUDI-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-2493:
--------------------------------------
    Labels: sev:critical  (was: )

> Verify removing glob pattern works w/ all key generators
> --------------------------------------------------------
>
>                 Key: HUDI-2493
>                 URL: https://issues.apache.org/jira/browse/HUDI-2493
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Spark Integration
>            Reporter: sivabalan narayanan
>            Assignee: Raymond Xu
>            Priority: Major
>              Labels: sev:critical
>
> In the last release we added support to remove glob pattern. i.e. 
> while reading hudi dataset, 
> {code:java}
> spark.read.format("hudi").load(basePath+"/*/*") 
> {code}
> -> 
> {code:java}
> spark.read.format("hudi").load(basePath)
> {code}
> Suffixing with
> {code:java}
> "/*/*"
> {code}
>  is not required anymore. 
> But we need to verify if the same works for all key generators before we can 
> announce that in general this can be used. Or else we have to call out for 
> what key gens this works. and put in a fix for those which does not work.
>  
> For eg:
> I tried removing glob pattern from few key generator tests in 
> TestCOWDataSource and it failed. 
> [https://github.com/apache/hudi/blob/36be28712196ff4427c41b0aa885c7fcd7356d7f/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala#L413]
>  
> ones which worked after removing glob pattern: 
> SimpleKeyGenerator, ComplexKeyGenerator, GlobalDeleteKeyGenerator, 
> NonpartitionedKeyGenerator
> Ones which did not work:
> CustomKeyGenerator, TimestampBasedKeyGenerator
>  
> You can try it locally by removing the glob pattern for these tests.
> stacktrace for timestamp based key gen
> {code:java}
> 0    [main] WARN  org.apache.spark.util.Utils  - Your hostname, 
> Sivabalans-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 
> 10.0.0.202 instead (on interface en0)0    [main] WARN  
> org.apache.spark.util.Utils  - Your hostname, Sivabalans-MacBook-Pro.local 
> resolves to a loopback address: 127.0.0.1; using 10.0.0.202 instead (on 
> interface en0)1    [main] WARN  org.apache.spark.util.Utils  - Set 
> SPARK_LOCAL_IP if you need to bind to another address390  [main] WARN  
> org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable11317 
> [main] WARN  org.apache.hudi.metadata.HoodieBackedTableMetadata  - Metadata 
> table was not found at path 
> file:/var/folders/ym/8yjkm3n90kq8tk4gfmvk7y140000gn/T/junit1114662825639168138/dataset/.hoodie/metadata11515
>  [main] WARN  org.apache.hudi.metadata.HoodieBackedTableMetadata  - Metadata 
> table was not found at path 
> file:/var/folders/ym/8yjkm3n90kq8tk4gfmvk7y140000gn/T/junit1114662825639168138/dataset/.hoodie/metadata11840
>  [main] WARN  org.apache.spark.util.Utils  - Truncated the string 
> representation of a plan since it was too large. This behavior can be 
> adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.12319 
> [main] WARN  org.apache.hudi.testutils.HoodieClientTestHarness  - Closing 
> file-system instance used in previous test-run
> org.opentest4j.AssertionFailedError: Expected :trueActual   :false<Click to 
> see difference> at 
> org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at 
> org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at 
> org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at 
> org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at 
> org.apache.hudi.functional.TestCOWDataSource.testSparkPartitonByWithTimestampBasedKeyGenerator(TestCOWDataSource.scala:517)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:688)
>  at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
>  at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
>  at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
>  at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
>  at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
>  at 
> org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
>  at 
> org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
>  at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
>  at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
>  at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
>  at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
>  at 
> org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
>  at 
> org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
>  at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$6(TestMethodTestDescriptor.java:212)
>  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>  at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:208)
>  at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:137)
>  at 
> org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:71)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:139)
>  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129)
>  at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) 
> at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127)
>  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84)
>  at java.util.ArrayList.forEach(ArrayList.java:1257) at 
> org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:143)
>  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129)
>  at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) 
> at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127)
>  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84)
>  at java.util.ArrayList.forEach(ArrayList.java:1257) at 
> org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:143)
>  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129)
>  at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) 
> at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127)
>  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126)
>  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84)
>  at 
> org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:32)
>  at 
> org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
>  at 
> org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:51)
>  at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107)
>  at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:87)
>  at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:53)
>  at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:66)
>  at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:51)
>  at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:87)
>  at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:66)
>  at 
> com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:71)
>  at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>  at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:221)
>  at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54)
> {code}
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to