[GitHub] [hudi] n3nash commented on issue #2696: Metadata and runtime exceptions in Hudi 0.7.0 on AWS Glue
n3nash commented on issue #2696: URL: https://github.com/apache/hudi/issues/2696#issuecomment-854398116
@kimberlyamandalu Are you able to try out the suggestions from @umehrot2?
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] n3nash commented on issue #2696: Metadata and runtime exceptions in Hudi 0.7.0 on AWS Glue
n3nash commented on issue #2696: URL: https://github.com/apache/hudi/issues/2696#issuecomment-824527320
@umehrot2 Are you able to jump in and help @kimberlyamandalu here?
[GitHub] [hudi] n3nash commented on issue #2696: Metadata and runtime exceptions in Hudi 0.7.0 on AWS Glue
n3nash commented on issue #2696: URL: https://github.com/apache/hudi/issues/2696#issuecomment-814684654
@kimberlyamandalu This is unknown territory for me as well. Let me loop in some AWS experts here. @umehrot2 Do you have any idea what the timeouts may be related to? To summarize this ticket's context, @kimberlyamandalu is running into the following exception when using the timeline server:
```
21/04/07 00:59:13 ERROR FileSystemViewHandler: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
```
[GitHub] [hudi] n3nash commented on issue #2696: Metadata and runtime exceptions in Hudi 0.7.0 on AWS Glue
n3nash commented on issue #2696: URL: https://github.com/apache/hudi/issues/2696#issuecomment-813861043
Yes, this is expected: you are probably using the timeline server, which is enabled by default. The timeline server runs on the Spark driver and serves these requests for the executors. We need to understand what the underlying exception is, specifically for `21/04/04 18:28:35 WARN ExceptionMapper: Uncaught exception`. What is the uncaught exception here?
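One way to answer that question from a saved driver log is to pull out whatever follows each `ExceptionMapper: Uncaught exception` line. The helper below is only a sketch: the function name is hypothetical, and it assumes every log entry starts with the `yy/MM/dd HH:mm:ss` timestamp seen in this thread, so that untimestamped lines after the marker are the stack trace.

```python
import re

# Marker text taken from the log line quoted in this thread.
MARKER = "ExceptionMapper: Uncaught exception"
# Assumption: each log entry starts with a "yy/MM/dd HH:mm:ss " timestamp.
TIMESTAMP = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} ")


def uncaught_exception_blocks(lines):
    """Return one list per marker line: the marker plus the
    untimestamped lines (typically a stack trace) that follow it."""
    blocks = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if MARKER in line:
            # A new marker closes any block that is still open.
            if current is not None:
                blocks.append(current)
            current = [line]
        elif current is not None:
            if TIMESTAMP.match(line):
                # The next timestamped entry ends the stack trace.
                blocks.append(current)
                current = None
            else:
                current.append(line)
    if current is not None:
        blocks.append(current)
    return blocks
```

It accepts any iterable of lines, so an open file handle for the driver log can be passed in directly.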
[GitHub] [hudi] n3nash commented on issue #2696: Metadata and runtime exceptions in Hudi 0.7.0 on AWS Glue
n3nash commented on issue #2696: URL: https://github.com/apache/hudi/issues/2696#issuecomment-812339166
@kimberlyamandalu Yes, you should be able to switch off your metadata table without any side effects. However, if you later want to turn the metadata table back on, you will need to delete the data under `basepath/.hoodie/metadata`. Once the metadata folder is empty, you can toggle the metadata table back on and things will work fine. There is a change in master now that does this automatically, but to debug the issue in your case I would recommend just deleting the metadata folder.
Sure, I can help you with making the changes so you can build and deploy a custom build. Before turning the metadata table `off`, you should make the following changes, deploy your custom build and collect the logs, because once you turn the metadata table `off` and then `on` the problem may no longer be reproducible.
1. Check out the 0.7.0 release tag by running `git checkout release-0.7.0`.
2. Add a bunch of logs to the following method -> https://github.com/apache/hudi/blob/release-0.7.0/hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/FileSystemViewHandler.java#L342. We ideally want to know which line of code is throwing the runtime exception and why, so adding logs before and after every method invocation inside it will be helpful.
3. Run the following command: `mvn clean package -DskipTests`
4. New bundle jars will be available under `packaging/*`. Use these new bundle jars to deploy and run to collect logs.
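As a rough illustration of the cleanup mentioned above, the sketch below builds the `basepath/.hoodie/metadata` location and removes it when the table lives on a local filesystem. Both function names are hypothetical, and the delete helper is an assumption that only covers local paths; a table on S3 would need the equivalent operation through an S3 client or CLI.

```python
import posixpath
import shutil
from pathlib import Path


def metadata_dir(base_path: str) -> str:
    """Return the metadata table location under a Hudi base path,
    i.e. <base_path>/.hoodie/metadata as described in this thread."""
    return posixpath.join(base_path.rstrip("/"), ".hoodie", "metadata")


def clear_local_metadata_dir(base_path: str) -> bool:
    """Delete the metadata folder for a table on the LOCAL filesystem.

    Returns True if a folder was removed, False if none existed.
    Hypothetical helper: it does not understand s3:// or hdfs:// paths.
    """
    target = Path(metadata_dir(base_path))
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

Only the metadata folder is removed; the rest of `.hoodie` is left in place.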
[GitHub] [hudi] n3nash commented on issue #2696: Metadata and runtime exceptions in Hudi 0.7.0 on AWS Glue
n3nash commented on issue #2696: URL: https://github.com/apache/hudi/issues/2696#issuecomment-810763459
@kimberlyamandalu Can you try turning off the metadata table in Hudi to get your pipeline unblocked?
```
hoodie.metadata.enable=false
```
This looks like an exception in the metadata table. Without more logs, it's hard to debug what may be going on. If you are OK with deploying a custom build, we can work on adding more logs to help surface the underlying issue. The exception is coming from https://github.com/apache/hudi/blob/release-0.7.0/hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/FileSystemViewHandler.java#L378. If we can add more logs to this function to see why a runtime exception is being thrown, it may help to find the root cause.
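For a job that sets Hudi options in code rather than in a properties file, the same switch could be carried in the write options. This is a minimal sketch, not the reporter's actual job: the function name and the table name in the usage note are made up, and `hoodie.table.name` is included only to show the flag next to another option.

```python
def hudi_write_options(table_name: str, metadata_enabled: bool = False) -> dict:
    """Build a dict of Hudi write options with the metadata table toggled.

    Hypothetical helper; values are strings because Spark data source
    options are passed as strings.
    """
    return {
        "hoodie.table.name": table_name,
        # The setting suggested in this thread to unblock the pipeline.
        "hoodie.metadata.enable": str(metadata_enabled).lower(),
    }
```

In a PySpark job the resulting dict would typically be handed to the writer, for example `df.write.format("hudi").options(**hudi_write_options("my_table"))`, alongside whatever other options the job already sets.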