[jira] [Commented] (SPARK-22805) Use aliases for StorageLevel in event logs
[ https://issues.apache.org/jira/browse/SPARK-22805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16294727#comment-16294727 ] Anthony Truchet commented on SPARK-22805: - As I understand it, this does not break any standard use and saves a lot of log volume for 2.1 and 2.2, so the ticket is worth keeping IMHO.
> Use aliases for StorageLevel in event logs
> ------------------------------------------
>
> Key: SPARK-22805
> URL: https://issues.apache.org/jira/browse/SPARK-22805
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 2.1.2, 2.2.1
> Reporter: Sergei Lebedev
> Priority: Minor
>
> Fact 1: {{StorageLevel}} has a private constructor, therefore the list of
> predefined levels is not extendable (by the users).
> Fact 2: The format of event logs uses a redundant representation for storage
> levels:
> {code}
> >>> len('{"Use Disk": true, "Use Memory": false, "Deserialized": true, "Replication": 1}')
> 79
> >>> len('DISK_ONLY')
> 9
> {code}
> Fact 3: This leads to excessive log sizes for workloads with lots of
> partitions, because every partition carries a storage level field which
> is 60-70 bytes larger than it needs to be.
> Suggested quick win: use the names of the predefined levels to identify them
> in the event log.
--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
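The suggested quick win can be sketched as follows. This is a hypothetical illustration, not Spark's actual serializer: a level whose fields match a predefined entry is written as its alias, and anything else falls back to the verbose field-by-field form. The `PREDEFINED` table below lists only a few levels for brevity.

```python
import json

# Hypothetical sketch of the suggested quick win: serialize a storage level
# as its predefined alias when one matches, and fall back to the verbose
# field-by-field JSON form otherwise. Field tuples are illustrative.
PREDEFINED = {
    # (use_disk, use_memory, deserialized, replication) -> alias
    (True, False, False, 1): "DISK_ONLY",
    (False, True, True, 1): "MEMORY_ONLY",
    (True, True, True, 1): "MEMORY_AND_DISK",
}

def storage_level_to_json(use_disk, use_memory, deserialized, replication):
    alias = PREDEFINED.get((use_disk, use_memory, deserialized, replication))
    if alias is not None:
        return json.dumps(alias)          # compact: e.g. '"DISK_ONLY"'
    return json.dumps({                   # verbose fallback for custom levels
        "Use Disk": use_disk,
        "Use Memory": use_memory,
        "Deserialized": deserialized,
        "Replication": replication,
    })

compact = storage_level_to_json(True, False, False, 1)   # predefined level
verbose = storage_level_to_json(True, False, False, 3)   # custom replication
print(len(compact), len(verbose))
```

With tens of thousands of partitions per stage, shaving 60-70 bytes per storage level field adds up quickly in the event log.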
[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246804#comment-16246804 ] Anthony Truchet commented on SPARK-18838: - Agreed, but at the same time this is "almost" a bug when you consider using Spark at really large scale... I tried a naive cherry-pick and, unsurprisingly, there are a lot of non-trivial conflicts. Do you know how to get an overview of the change (and of what has happened in the meantime since 2.2) to guide this conflict resolution? Or maybe you already have some insight about the best way to go about it?
> High latency of event processing for large jobs
> -----------------------------------------------
>
> Key: SPARK-18838
> URL: https://issues.apache.org/jira/browse/SPARK-18838
> Project: Spark
> Issue Type: Improvement
> Affects Versions: 2.0.0
> Reporter: Sital Kedia
> Assignee: Marcelo Vanzin
> Fix For: 2.3.0
>
> Attachments: SparkListernerComputeTime.xlsx, perfResults.pdf
>
> Currently we are observing very high event processing delay in the
> driver's `ListenerBus` for large jobs with many tasks. Many critical
> components of the scheduler, like `ExecutorAllocationManager` and
> `HeartbeatReceiver`, depend on the `ListenerBus` events, and this delay might
> hurt job performance significantly or even fail the job. For example, a
> significant delay in receiving `SparkListenerTaskStart` might cause
> `ExecutorAllocationManager` to mistakenly remove an executor which is
> not idle.
> The problem is that the event processor in `ListenerBus` is a single thread
> which loops through all the listeners for each event and processes each event
> synchronously:
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala#L94
> This single-threaded processor often becomes the bottleneck for large jobs.
> Also, if one of the listeners is very slow, all the listeners pay the
> price of the delay incurred by the slow one. In addition, a slow
> listener can cause events to be dropped from the event queue, which might be
> fatal to the job.
> To solve the above problems, we propose to get rid of the shared event queue and the
> single-threaded event processor. Instead, each listener will have its own
> dedicated single-threaded executor service. Whenever an event is posted, it
> will be submitted to the executor service of every listener. The
> single-threaded executor service guarantees in-order processing of the events
> per listener. The queue used by the executor service will be bounded, to
> guarantee that memory does not grow indefinitely. The downside of this
> approach is that a separate event queue per listener increases the driver memory
> footprint.
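The per-listener design proposed in the description can be sketched in a few lines. This is a minimal, hypothetical Python model (class and method names are illustrative, not Spark's actual API): each listener gets its own bounded queue drained by one dedicated thread, so a slow listener delays only itself, ordering is preserved per listener, and a full queue drops events instead of growing without bound.

```python
import queue
import threading

class PerListenerBus:
    """Minimal sketch of the proposal: one bounded queue plus one dedicated
    worker thread per listener. Names are hypothetical, not Spark's API."""

    def __init__(self, capacity=10000):
        self._capacity = capacity
        self._queues = []

    def add_listener(self, listener):
        q = queue.Queue(maxsize=self._capacity)  # bounded: memory cannot grow forever

        def drain():
            while True:
                event = q.get()
                listener(event)   # one thread per listener => in-order delivery
                q.task_done()

        threading.Thread(target=drain, daemon=True).start()
        self._queues.append(q)

    def post(self, event):
        for q in self._queues:
            try:
                q.put_nowait(event)   # if this listener's queue is full, drop the event
            except queue.Full:
                pass                  # only the slow listener loses events

    def flush(self):
        for q in self._queues:
            q.join()                  # wait until every posted event is processed

bus = PerListenerBus()
seen = []
bus.add_listener(seen.append)
for i in range(5):
    bus.post(i)
bus.flush()
print(seen)  # in-order: [0, 1, 2, 3, 4]
```

The memory-footprint downside is visible here too: every listener holds its own copy of the queued event references.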
[jira] [Comment Edited] (SPARK-18838) High latency of event processing for large jobs
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234891#comment-16234891 ] Anthony Truchet edited comment on SPARK-18838 at 11/1/17 10:42 PM: - I'm interested in working on a backport to Spark 2.2, which we are using at Criteo. Any interest in an official backport, or other thoughts on that?
was (Author: anthony-truchet): I'm interested in working on a backport to Spark 2.2, which we are using at Criteo. Any thoughts on that?
[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234891#comment-16234891 ] Anthony Truchet commented on SPARK-18838: - I'm interested in working on a backport to Spark 2.2, which we are using at Criteo. Any thoughts on that?
[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075600#comment-16075600 ] Anthony Truchet commented on SPARK-18838: - Hello, we (the Criteo Predictive Search team, http://labs.criteo.com/about-us/) are critically affected by this issue and I'm willing to work on it ASAP. Is there any update not listed here? Is the plan described in the ticket broadly agreed upon? I'll start diving into the code; in the meantime, all design input and experience reports are welcome :-)
[jira] [Commented] (SPARK-18612) Leaked broadcasted variable Mllib
[ https://issues.apache.org/jira/browse/SPARK-18612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702446#comment-15702446 ] Anthony Truchet commented on SPARK-18612: - See related:
* https://issues.apache.org/jira/browse/SPARK-16440
* https://issues.apache.org/jira/browse/SPARK-16696
> Leaked broadcasted variable Mllib
> ---------------------------------
>
> Key: SPARK-18612
> URL: https://issues.apache.org/jira/browse/SPARK-18612
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Affects Versions: 1.6.3, 2.0.2
> Reporter: Anthony Truchet
> Priority: Trivial
>
> Fix broadcast variable leaks in MLlib.
> For example, `bcW` in the L-BFGS CostFun.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-18612) Leaked broadcasted variable Mllib
Anthony Truchet created SPARK-18612:
---------------------------------------
Summary: Leaked broadcasted variable Mllib
Key: SPARK-18612
URL: https://issues.apache.org/jira/browse/SPARK-18612
Project: Spark
Issue Type: Bug
Components: MLlib
Affects Versions: 2.0.2, 1.6.3
Reporter: Anthony Truchet

Fix broadcast variable leaks in MLlib. For example, `bcW` in the L-BFGS CostFun.
[jira] [Commented] (SPARK-18471) In treeAggregate, generate (big) zeros instead of sending them.
[ https://issues.apache.org/jira/browse/SPARK-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15670795#comment-15670795 ] Anthony Truchet commented on SPARK-18471: - Sure. But in our use case we do want to aggregate on a DenseVector. Here is some context: we train a logistic regression on a very big hash space and a large volume of data. For each piece of data the features and the gradient are sparse, but the aggregate gets denser and denser, up to the point where it is almost fully dense (as observed in our current in-house implementation). So we do want to aggregate on a DenseVector, but we do not need nor want to send hundreds of MB of zeros as part of the closure.
> In treeAggregate, generate (big) zeros instead of sending them.
> ---------------------------------------------------------------
>
> Key: SPARK-18471
> URL: https://issues.apache.org/jira/browse/SPARK-18471
> Project: Spark
> Issue Type: Improvement
> Components: MLlib, Spark Core
> Reporter: Anthony Truchet
> Priority: Minor
>
> When using an optimization routine like L-BFGS, treeAggregate currently sends the
> zero vector as part of the closure. This zero can be huge (e.g. ML vectors
> with millions of zeros) but can be easily generated.
> Several options are possible (patches to come soon for some of them).
> One is to provide a treeAggregateWithZeroGenerator method (either in core or
> in MLlib) which wraps treeAggregate in an Option and generates the zero if None.
> Another is to rewrite treeAggregate to wrap an underlying implementation
> which uses a zero generator directly.
> There might be other, better alternatives we have not spotted...
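The idea behind the ticket can be illustrated with a small model of `treeAggregate`. This is a hypothetical sketch, not Spark's implementation: instead of capturing a huge dense zero vector in the closure shipped to every task, each partition builds its own zero locally from a generator function, while the sparse records still fold into a dense accumulator.

```python
from functools import reduce

def tree_aggregate_with_zero_generator(partitions, zero_gen, seq_op, comb_op):
    """Sketch of the proposal: the zero is generated on each partition via
    `zero_gen` instead of being serialized into the task closure.
    Function names here are illustrative only."""
    def aggregate_partition(part):
        acc = zero_gen()              # zero created locally, never shipped
        for record in part:
            acc = seq_op(acc, record)
        return acc
    partials = [aggregate_partition(p) for p in partitions]  # map side
    return reduce(comb_op, partials)                         # combine side

# Toy usage: fold sparse (index, value) updates into a dense accumulator.
def zero_gen():
    return [0.0] * 8                  # imagine millions of entries in practice

def seq_op(acc, update):
    idx, val = update
    acc[idx] += val
    return acc

def comb_op(a, b):
    return [x + y for x, y in zip(a, b)]

parts = [[(0, 1.0), (3, 2.0)], [(3, 1.0), (7, 5.0)]]
result = tree_aggregate_with_zero_generator(parts, zero_gen, seq_op, comb_op)
print(result)
```

The per-record updates stay cheap because the inputs are sparse, yet the final `result` is dense, matching the behaviour described in the comment above.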
[jira] [Created] (SPARK-18471) In treeAggregate, generate (big) zeros instead of sending them.
Anthony Truchet created SPARK-18471:
---------------------------------------
Summary: In treeAggregate, generate (big) zeros instead of sending them.
Key: SPARK-18471
URL: https://issues.apache.org/jira/browse/SPARK-18471
Project: Spark
Issue Type: Improvement
Components: MLlib, Spark Core
Reporter: Anthony Truchet
Priority: Minor

When using an optimization routine like L-BFGS, treeAggregate currently sends the zero vector as part of the closure. This zero can be huge (e.g. ML vectors with millions of zeros) but can be easily generated.
Several options are possible (patches to come soon for some of them).
One is to provide a treeAggregateWithZeroGenerator method (either in core or in MLlib) which wraps treeAggregate in an Option and generates the zero if None.
Another is to rewrite treeAggregate to wrap an underlying implementation which uses a zero generator directly.
There might be other, better alternatives we have not spotted...
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387342#comment-15387342 ] Anthony Truchet commented on SPARK-16440: - Regarding the try/finally: we run numerous trainings within the same Spark context, some with vocabularies so large that they fail (yes, we do try to filter out the ones that are too big, but "too big" is difficult to define). So we are in a context where we do care about resource cleanup in case of error, in order to enable thousands of successive trainings, some of which are expected to fail. As for code readability, we can try to refactor the function to reduce the nesting or find a "nice" Scala solution: I'll propose a patch and welcome any feedback on it.
> Undeleted broadcast variables in Word2Vec causing OoM for long runs
> -------------------------------------------------------------------
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Assignee: Anthony Truchet
> Fix For: 1.6.3, 2.0.1
>
> Original Estimate: 4h
> Remaining Estimate: 4h
>
> Three broadcast variables created at the beginning of {{Word2Vec.fit()}} are
> never deleted nor unpersisted. This seems to cause excessive memory
> consumption on the driver for a job running hundreds of successive trainings.
> They are:
> {code}
> val expTable = sc.broadcast(createExpTable())
> val bcVocab = sc.broadcast(vocab)
> val bcVocabHash = sc.broadcast(vocabHash)
> {code}
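The try/finally pattern discussed in this comment can be sketched with a stand-in broadcast object. This is an illustrative Python model (the `FakeBroadcast` class and `fit` helper are hypothetical, not Spark's API): cleanup runs in the `finally` block, so the broadcast is destroyed even when a training fails mid-way, which is exactly what long-running jobs with thousands of successive trainings need.

```python
class FakeBroadcast:
    """Stand-in for a Spark broadcast variable (illustrative only)."""
    def __init__(self, value):
        self.value = value
        self.destroyed = False

    def destroy(self):
        self.destroyed = True

def fit(broadcast_fn, vocab, train):
    """Create the broadcast, run training, and guarantee cleanup."""
    bc_vocab = broadcast_fn(vocab)
    try:
        return train(bc_vocab)        # may raise for oversized vocabularies
    finally:
        bc_vocab.destroy()            # cleanup happens even on failure

# Track the broadcasts created, so we can check they were destroyed.
bc_refs = []
def broadcast_fn(value):
    bc = FakeBroadcast(value)
    bc_refs.append(bc)
    return bc

def failing_train(bc):
    raise MemoryError("vocabulary too large")

try:
    fit(broadcast_fn, ["a", "b"], failing_train)
except MemoryError:
    pass                              # the job fails, but the broadcast is freed
```

Without the `finally`, a failed training would leave its broadcasts alive on the driver, which is precisely the leak this ticket describes.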
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384118#comment-15384118 ] Anthony Truchet commented on SPARK-16440: - I will, as well as putting this in a try/finally to ensure proper deletion even in case of errors.
[jira] [Comment Edited] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383967#comment-15383967 ] Anthony Truchet edited comment on SPARK-16440 at 7/19/16 11:21 AM: - Thanks for such a quick fix [~srowen]: I was offline for the past week, which is why I couldn't submit the patch quickly enough. I would have {{destroyed}} the variables instead of {{unpersisting}} them, though, as the issue was memory consumption on the *driver* side: what am I missing that made you choose the latter over the former?
was (Author: anthony-truchet): Thanks for such a quick fix [~srowen]: I was offline for the past week, which is why I couldn't submit the patch quickly enough. I would have {{destroy}}ed the variables instead of {{unpersist}}ing them, though, as the issue was memory consumption on the driver side: what am I missing that made you choose the latter over the former?
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383967#comment-15383967 ] Anthony Truchet commented on SPARK-16440: - Thanks for such a quick fix [~srowen]: I was offline for the past week, which is why I couldn't submit the patch quickly enough. I would have {{destroy}}ed the variables instead of {{unpersist}}ing them, though, as the issue was memory consumption on the driver side: what am I missing that made you choose the latter over the former?
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367615#comment-15367615 ] Anthony Truchet commented on SPARK-16440: - Hello Spark developers, I'm preparing a patch for this issue. This will be my first contribution to Spark. I'll strive to follow the contribution guidelines, but please do not hesitate to tell me how to do it better if needed :-)
[jira] [Created] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
Anthony Truchet created SPARK-16440:
---------------------------------------
Summary: Undeleted broadcast variables in Word2Vec causing OoM for long runs
Key: SPARK-16440
URL: https://issues.apache.org/jira/browse/SPARK-16440
Project: Spark
Issue Type: Bug
Components: MLlib
Affects Versions: 1.6.2, 1.6.1, 1.6.0, 2.0.0
Reporter: Anthony Truchet

Three broadcast variables created at the beginning of {{Word2Vec.fit()}} are never deleted nor unpersisted. This seems to cause excessive memory consumption on the driver for a job running hundreds of successive trainings. They are:
{code}
val expTable = sc.broadcast(createExpTable())
val bcVocab = sc.broadcast(vocab)
val bcVocabHash = sc.broadcast(vocabHash)
{code}