[GitHub] spark pull request: [SPARK-10529][SQL]When creating multiple HiveC...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/8713 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10529][SQL]When creating multiple HiveC...
Github user GavinGavinNo1 commented on the pull request: https://github.com/apache/spark/pull/8713#issuecomment-141414230 @marmbrus Well, we both know that we can have multiple contexts. The difference is that it can't support creating contexts continuously. No matter how large my permgen is, it will lead to a memory leak and cause "too many JDBC connections" errors. As for what you said about different metastores, I think a given environment normally has a single version of the metastore. I'm sure you have good reasons for rejecting my proposal beyond what you have expressed. Otherwise, adding a parameter to control this behavior could deal with both problems.
[GitHub] spark pull request: [SPARK-10529][SQL]When creating multiple HiveC...
Github user GavinGavinNo1 commented on the pull request: https://github.com/apache/spark/pull/8713#issuecomment-141073796 @marmbrus Thanks a lot. I'm sorry I didn't make myself clear. I meant that I'm not familiar with submitting an issue or contributing to Spark. I have in fact considered what you suggest; however, I can neither push forward restructuring our app nor wait for a stable Spark 1.5. Anyway, Spark won't adapt to our app. But I still wonder whether supporting multiple HiveContexts in one JVM could become a feature, which I think would be more flexible.
[GitHub] spark pull request: [SPARK-10529][SQL]When creating multiple HiveC...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/8713#issuecomment-141166255 You can have multiple contexts; you just have to increase the size of your permgen (or run Java 8). The problem with this change is that it makes things less flexible, since you would no longer be able to connect to multiple different metastores from the same JVM. Given that, would you mind closing this issue? I'll also add that Spark 1.5 was released last week and we'll be releasing Spark 1.5.1 shortly.
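For readers following the permgen suggestion above, a minimal sketch of one way to raise the limit for the driver JVM through `spark-defaults.conf`. The value `512m` is an illustrative assumption, not a figure from this thread, and the option only applies to Java 7 and earlier; Java 8 removed the permanent generation and ignores the flag with a warning.

```
# spark-defaults.conf (fragment)
# Java 7 and earlier only: raise the PermGen ceiling of the driver JVM.
# 512m is an arbitrary illustrative value; size it for your own workload.
spark.driver.extraJavaOptions  -XX:MaxPermSize=512m
```

In client mode the driver JVM has already started by the time application code runs, so this option generally has to come from the properties file or from `spark-submit --driver-java-options` rather than from a `SparkConf` built inside the application.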
[GitHub] spark pull request: [SPARK-10529][SQL]When creating multiple HiveC...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/8713#issuecomment-140835592 I would suggest increasing the size of your perm gen, and/or restructuring your app to avoid creating multiple HiveContexts. Spark 1.5 adds the ability to do dynamic allocation in standalone mode.
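The dynamic allocation mentioned above is switched on through configuration. The fragment below is a hedged sketch of the relevant `spark-defaults.conf` properties; the specific values are assumptions chosen for illustration and do not come from this thread.

```
# spark-defaults.conf (fragment)
# Let Spark add and remove executors as the workload changes.
spark.dynamicAllocation.enabled              true
# Dynamic allocation relies on the external shuffle service,
# which must also be enabled on each standalone worker.
spark.shuffle.service.enabled                true
# Illustrative values: allow scaling down to zero executors and
# release an executor after it has been idle for 60 seconds.
spark.dynamicAllocation.minExecutors         0
spark.dynamicAllocation.executorIdleTimeout  60s
```

With settings along these lines, an idle application can give executors back to the cluster without stopping its SparkContext, which may remove the need to create a new HiveContext for every task.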
[GitHub] spark pull request: [SPARK-10529][SQL]When creating multiple HiveC...
Github user GavinGavinNo1 commented on the pull request: https://github.com/apache/spark/pull/8713#issuecomment-140621464 @marmbrus Sorry to disturb again. Could you please give me a reply? It's my first try. Maybe I need some advice.
[GitHub] spark pull request: [SPARK-10529][SQL]When creating multiple HiveC...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8713#issuecomment-139465728 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-10529][SQL]When creating multiple HiveC...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/8713#issuecomment-139670064 Another reason for the isolation is the ability to connect to multiple metastores. Since Hive uses global static state, new classloaders are likely the only way to accomplish this. Why are you trying to create more than one HiveContext in a JVM?
[GitHub] spark pull request: [SPARK-10529][SQL]When creating multiple HiveC...
Github user GavinGavinNo1 commented on the pull request: https://github.com/apache/spark/pull/8713#issuecomment-139702061 Thank you very much for your comment. I don't think I understood what you mean by the ability to connect to multiple metastores. One HiveContext can only connect to one metastore, right? Or do you mean creating multiple HiveContexts to connect to multiple metastores with one SparkContext in one JVM? If so, it would lead to the same JVM OOM problem in theory. We previously used Spark 1.3.1, which does not support dynamic allocation in standalone mode. We have several apps, and each one launches scheduled tasks using HiveContext. Because hardware resources are limited, we must stop the SparkContext to release CPU and memory when a task is done. Spark 1.4.1 brought many new features, and we want to switch to that version. However, the problems mentioned in my issue cause us a lot of trouble.