[ https://issues.apache.org/jira/browse/SPARK-16378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-16378:
------------------------------
    Priority: Minor  (was: Critical)

This generally isn't a supported use case (restarting a context within one application), but you may find ways to trigger the cleanup earlier without affecting semantics.

> HiveContext doesn't release resources
> -------------------------------------
>
>                 Key: SPARK-16378
>                 URL: https://issues.apache.org/jira/browse/SPARK-16378
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API, SQL
>    Affects Versions: 1.6.0
>         Environment: Linux Ubuntu
>            Reporter: Slava
>            Priority: Minor
>
> I am running this simple code:
> HiveContext hiveContext = new HiveContext(new JavaSparkContext(conf));
> hiveContext.sparkContext().stop();
> Each HiveContext creation creates 100+ .dat files.
> They can be counted with "ls -l | grep dat | wc -l" and listed with "ls -l | grep dat" in the /proc/PID/fd directory:
> lrwx------ 1 dropwizard dropwizard 64 Jul 4 21:39 891 -> /tmp/spark-3625050e-6d18-421f-89ae-9859e9edfb9f/metastore/seg0/c650.dat
> lrwx------ 1 dropwizard dropwizard 64 Jul 4 21:39 893 -> /tmp/spark-3625050e-6d18-421f-89ae-9859e9edfb9f/metastore/seg0/c670.dat
> lrwx------ 1 dropwizard dropwizard 64 Jul 4 21:39 895 -> /tmp/spark-3625050e-6d18-421f-89ae-9859e9edfb9f/metastore/seg0/c690.dat
> In my application I use a "short-lived" context: I create and stop it repeatedly.
> It seems that stopping the SparkContext doesn't stop the HiveContext, so these files (and, it seems, other resources) aren't released (deleted).
> HiveContext itself doesn't have a stop method.
> Thus, the next time I create a context, it creates another 100+ files. Eventually I run out of open file descriptors and hit a "Too many open files" error that ultimately crashes the server.
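As the comment above notes, restarting a context within one application isn't a supported use case; the usual pattern is a single long-lived context reused for the life of the process. A minimal sketch of that pattern, assuming the application (a Dropwizard-style server, judging by the fd listing) can share one HiveContext across requests; SharedHiveContext and its get method are hypothetical names for illustration, not Spark API:

{code:java}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;

/**
 * Hypothetical holder that lazily creates one HiveContext and reuses it
 * for the life of the application instead of recreating it per request.
 */
public final class SharedHiveContext {

    private static volatile HiveContext instance;

    private SharedHiveContext() {}

    public static HiveContext get(SparkConf conf) {
        if (instance == null) {
            synchronized (SharedHiveContext.class) {
                if (instance == null) {
                    // Created once; the Derby-backed metastore is opened
                    // a single time, so its .dat descriptors don't accumulate.
                    instance = new HiveContext(new JavaSparkContext(conf));
                }
            }
        }
        return instance;
    }
}
{code}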
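If short-lived contexts are unavoidable, note that the leaked seg0/c*.dat files are storage files of the embedded Derby database backing the local Hive metastore, so one way to trigger the cleanup earlier may be to shut Derby down explicitly after stopping the SparkContext. A sketch under that assumption (default embedded-Derby metastore, nothing else in the JVM using Derby; the master setting is illustrative). Derby signals a successful system shutdown by throwing an SQLException with SQLState XJ015, so that exception is expected:

{code:java}
import java.sql.DriverManager;
import java.sql.SQLException;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;

public class ShortLivedHiveContext {

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("short-lived-hive-context")
                .setMaster("local[*]"); // assumption: local mode, as in the report

        JavaSparkContext jsc = new JavaSparkContext(conf);
        HiveContext hiveContext = new HiveContext(jsc);

        // ... run queries ...

        jsc.stop();

        // Workaround sketch: the leaked .dat files are embedded Derby
        // store files, so shutting Derby down should close them.
        try {
            // Shuts down ALL embedded Derby databases in this JVM.
            DriverManager.getConnection("jdbc:derby:;shutdown=true");
        } catch (SQLException e) {
            // Derby reports a clean system shutdown as SQLState XJ015;
            // anything else is a real failure worth logging.
            if (!"XJ015".equals(e.getSQLState())) {
                e.printStackTrace();
            }
        }
    }
}
{code}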