[jira] [Updated] (SPARK-12691) Multiple unionAll on Dataframe seems to cause repeated calculations in a "Fibonacci" manner

2016-01-06 Thread Allen Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Liang updated SPARK-12691: Affects Version/s: 1.3.0 1.3.1 1.4.0

[jira] [Assigned] (SPARK-12690) NullPointerException in UnsafeInMemorySorter.free()

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12690: Assignee: Apache Spark > NullPointerException in UnsafeInMemorySorter.free() > ---

[jira] [Created] (SPARK-12691) Multiple unionAll on Dataframe seems to cause repeated calculations in a "Fibonacci" manner

2016-01-06 Thread Allen Liang (JIRA)
Allen Liang created SPARK-12691: --- Summary: Multiple unionAll on Dataframe seems to cause repeated calculations in a "Fibonacci" manner Key: SPARK-12691 URL: https://issues.apache.org/jira/browse/SPARK-12691

[jira] [Created] (SPARK-12690) NullPointerException in UnsafeInMemorySorter.free()

2016-01-06 Thread Carson Wang (JIRA)
Carson Wang created SPARK-12690: --- Summary: NullPointerException in UnsafeInMemorySorter.free() Key: SPARK-12690 URL: https://issues.apache.org/jira/browse/SPARK-12690 Project: Spark Issue Type:

[jira] [Commented] (SPARK-12622) spark-submit fails on executors when jar has a space in it

2016-01-06 Thread Ajesh Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087007#comment-15087007 ] Ajesh Kumar commented on SPARK-12622: - 1) I tried sbt build with name:"spark test". B

[jira] [Updated] (SPARK-12575) Grammar parity with existing SQL parser

2016-01-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12575: Description: The new parser should be compatible with our existing SQL parser built using Scala pa

[jira] [Commented] (SPARK-11780) Provide type aliases in org.apache.spark.sql.types for backwards compatibility

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086989#comment-15086989 ] Apache Spark commented on SPARK-11780: -- User 'maropu' has created a pull request for

[jira] [Commented] (SPARK-12576) Enable expression parsing (used in DataFrames)

2016-01-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086968#comment-15086968 ] Reynold Xin commented on SPARK-12576: - [~hvanhovell] this is next on the priority lis

[jira] [Assigned] (SPARK-12688) Spill size metric does not update for tungsten aggregation

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12688: Assignee: Apache Spark > Spill size metric does not update for tungsten aggregation >

[jira] [Commented] (SPARK-12006) GaussianMixture.train crashes if an initial model is not None

2016-01-06 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086878#comment-15086878 ] Yin Huai commented on SPARK-12006: -- Sorry. I have reverted it from 1.4, 1.5, 1.6, and ma

[jira] [Commented] (SPARK-12006) GaussianMixture.train crashes if an initial model is not None

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086874#comment-15086874 ] Apache Spark commented on SPARK-12006: -- User 'yhuai' has created a pull request for

[jira] [Commented] (SPARK-12686) Support group-by push down into data sources

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086860#comment-15086860 ] Apache Spark commented on SPARK-12686: -- User 'maropu' has created a pull request for

[jira] [Assigned] (SPARK-12686) Support group-by push down into data sources

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12686: Assignee: Apache Spark > Support group-by push down into data sources > --

[jira] [Assigned] (SPARK-12686) Support group-by push down into data sources

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12686: Assignee: (was: Apache Spark) > Support group-by push down into data sources > ---

[jira] [Commented] (SPARK-12656) Rewrite Intersect phyiscal plan using semi-join

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086856#comment-15086856 ] Apache Spark commented on SPARK-12656: -- User 'gatorsmile' has created a pull request

[jira] [Assigned] (SPARK-12656) Rewrite Intersect phyiscal plan using semi-join

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12656: Assignee: Apache Spark > Rewrite Intersect phyiscal plan using semi-join > ---

[jira] [Assigned] (SPARK-12656) Rewrite Intersect phyiscal plan using semi-join

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12656: Assignee: (was: Apache Spark) > Rewrite Intersect phyiscal plan using semi-join >

[jira] [Created] (SPARK-12686) Support group-by push down into data sources

2016-01-06 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-12686: Summary: Support group-by push down into data sources Key: SPARK-12686 URL: https://issues.apache.org/jira/browse/SPARK-12686 Project: Spark Issue Ty

[jira] [Resolved] (SPARK-12678) MapPartitionsRDD should clear reference to prev RDD

2016-01-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12678. - Resolution: Fixed Assignee: Guillaume Poulin Fix Version/s: 2.0.0 > MapPartitions

[jira] [Updated] (SPARK-12678) MapPartitionsRDD should clear reference to prev RDD

2016-01-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12678: Fix Version/s: 1.6.1 > MapPartitionsRDD should clear reference to prev RDD > --

[jira] [Resolved] (SPARK-12673) Prepending base URI of job description is missing

2016-01-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-12673. -- Resolution: Fixed Assignee: Saisai Shao > Prepending base URI of job description is missi

[jira] [Updated] (SPARK-12673) Prepending base URI of job description is missing

2016-01-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-12673: - Fix Version/s: 2.0.0 1.6.1 1.5.3 > Prepending base URI of j

[jira] [Commented] (SPARK-12680) Loading Word2Vec model in pyspark gives "ValueError: too many values to unpack" in findSynonyms

2016-01-06 Thread Sloane Simmons (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086822#comment-15086822 ] Sloane Simmons commented on SPARK-12680: Sorry, I hadn't found SPARK-12016 when I

[jira] [Commented] (SPARK-12317) Support configurate value for AUTO_BROADCASTJOIN_THRESHOLD and SHUFFLE_TARGET_POSTSHUFFLE_INPUT_SIZE with unit(e.g. kb/mb/gb) in SQLConf

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086811#comment-15086811 ] Apache Spark commented on SPARK-12317: -- User 'kevinyu98' has created a pull request

[jira] [Assigned] (SPARK-12635) More efficient (column batch) serialization for Python/R

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12635: Assignee: Apache Spark > More efficient (column batch) serialization for Python/R > --

[jira] [Commented] (SPARK-12635) More efficient (column batch) serialization for Python/R

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086800#comment-15086800 ] Apache Spark commented on SPARK-12635: -- User 'nongli' has created a pull request for

[jira] [Assigned] (SPARK-12635) More efficient (column batch) serialization for Python/R

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12635: Assignee: (was: Apache Spark) > More efficient (column batch) serialization for Python

[jira] [Resolved] (SPARK-7689) Remove TTL-based metadata cleaning (spark.cleaner.ttl)

2016-01-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-7689. Resolution: Fixed Assignee: Josh Rosen (was: Apache Spark) Fix Version/s: 2.0.0 > R

[jira] [Updated] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-12685: --- Description: the log of word2vec reports trainWordsCount = -785727483 during computation over a lar

[jira] [Updated] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-12685: --- Priority: Minor (was: Trivial) > word2vec trainWordsCount gets overflow > --

[jira] [Assigned] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12685: Assignee: Apache Spark > word2vec trainWordsCount gets overflow >

[jira] [Assigned] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12685: Assignee: (was: Apache Spark) > word2vec trainWordsCount gets overflow > -

[jira] [Commented] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086789#comment-15086789 ] Apache Spark commented on SPARK-12685: -- User 'hhbyyh' has created a pull request for

[jira] [Updated] (SPARK-12685) word2vec trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-12685: --- Summary: word2vec trainWordsCount gets overflow (was: word2vec logingo trainWordsCount gets overflow

[jira] [Commented] (SPARK-12685) word2vec logingo trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086786#comment-15086786 ] yuhao yang commented on SPARK-12685: Update the priority as it will affect the compu

[jira] [Created] (SPARK-12685) word2vec logingo trainWordsCount gets overflow

2016-01-06 Thread yuhao yang (JIRA)
yuhao yang created SPARK-12685: -- Summary: word2vec logingo trainWordsCount gets overflow Key: SPARK-12685 URL: https://issues.apache.org/jira/browse/SPARK-12685 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-12656) Rewrite Intersect phyiscal plan using semi-join

2016-01-06 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086782#comment-15086782 ] Xiao Li commented on SPARK-12656: - Starting it. Will submit a PR tonight. Thanks! > Rewr

[jira] [Assigned] (SPARK-12662) Add a local sort operator to DataFrame used by randomSplit

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12662: Assignee: Sameer Agarwal (was: Apache Spark) > Add a local sort operator to DataFrame use

[jira] [Commented] (SPARK-12662) Add a local sort operator to DataFrame used by randomSplit

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086770#comment-15086770 ] Apache Spark commented on SPARK-12662: -- User 'sameeragarwal' has created a pull requ

[jira] [Assigned] (SPARK-12662) Add a local sort operator to DataFrame used by randomSplit

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12662: Assignee: Apache Spark (was: Sameer Agarwal) > Add a local sort operator to DataFrame use

[jira] [Updated] (SPARK-12663) More informative error message in MLUtils.loadLibSVMFile

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12663: -- Assignee: Robert Dodier > More informative error message in MLUtils.loadLibSVMFile > --

[jira] [Resolved] (SPARK-12663) More informative error message in MLUtils.loadLibSVMFile

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-12663. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 10611 [h

[jira] [Updated] (SPARK-12016) word2vec load model can't use findSynonyms to get words

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12016: -- Fix Version/s: 1.6.1 1.5.3 > word2vec load model can't use findSynon

[jira] [Resolved] (SPARK-12640) Add benchmarks to measure the speed ups of UnsafeRowParquetReaderReader.

2016-01-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12640. - Resolution: Fixed Assignee: Nong Li Fix Version/s: 2.0.0 > Add benchmarks to meas

[jira] [Commented] (SPARK-12682) Hive will fail if the schema of a parquet table has a very wide schema

2016-01-06 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086650#comment-15086650 ] Yin Huai commented on SPARK-12682: -- I think we should also add the flag to disable savin

[jira] [Commented] (SPARK-11838) Spark SQL query fragment RDD reuse

2016-01-06 Thread Evan Chan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086649#comment-15086649 ] Evan Chan commented on SPARK-11838: --- Based on everything that is said, it seems instead

[jira] [Updated] (SPARK-12016) word2vec load model can't use findSynonyms to get words

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12016: -- Target Version/s: 1.5.3, 1.6.1, 2.0.0 > word2vec load model can't use findSynonyms to g

[jira] [Commented] (SPARK-12016) word2vec load model can't use findSynonyms to get words

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086628#comment-15086628 ] Joseph K. Bradley commented on SPARK-12016: --- This needs to be backported to 1.5

[jira] [Closed] (SPARK-12680) Loading Word2Vec model in pyspark gives "ValueError: too many values to unpack" in findSynonyms

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-12680. - Resolution: Duplicate > Loading Word2Vec model in pyspark gives "ValueError: too many val

[jira] [Commented] (SPARK-12680) Loading Word2Vec model in pyspark gives "ValueError: too many values to unpack" in findSynonyms

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086625#comment-15086625 ] Joseph K. Bradley commented on SPARK-12680: --- Oh, I forgot; this was found befor

[jira] [Resolved] (SPARK-12604) Java count(AprroxDistinct)ByKey methods return Scala Long not Java

2016-01-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12604. - Resolution: Fixed Fix Version/s: 2.0.0 > Java count(AprroxDistinct)ByKey methods return Sc

[jira] [Commented] (SPARK-12680) Loading Word2Vec model in pyspark gives "ValueError: too many values to unpack" in findSynonyms

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086615#comment-15086615 ] Joseph K. Bradley commented on SPARK-12680: --- It fails on 1.5 for me too. Will

[jira] [Updated] (SPARK-12684) Matrix.toString should take a format for how each cell should be printed

2016-01-06 Thread Chris Roberts (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Roberts updated SPARK-12684: -- Description: Currently there is no way to control how long the printout of a cell value is in M

[jira] [Commented] (SPARK-12684) Matrix.toString should take a format for how each cell should be printed

2016-01-06 Thread Chris Roberts (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086606#comment-15086606 ] Chris Roberts commented on SPARK-12684: --- [~holdenk] We talked about this issue over

[jira] [Commented] (SPARK-12680) Loading Word2Vec model in pyspark gives "ValueError: too many values to unpack" in findSynonyms

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086602#comment-15086602 ] Joseph K. Bradley commented on SPARK-12680: --- I'm having trouble reproducing th

[jira] [Resolved] (SPARK-12539) support writing bucketed table

2016-01-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12539. - Resolution: Fixed Assignee: Wenchen Fan (was: Apache Spark) Fix Version/s: 2.0.0

[jira] [Created] (SPARK-12684) Matrix.toString should take a format for how each cell should be printed

2016-01-06 Thread Chris Roberts (JIRA)
Chris Roberts created SPARK-12684: - Summary: Matrix.toString should take a format for how each cell should be printed Key: SPARK-12684 URL: https://issues.apache.org/jira/browse/SPARK-12684 Project: S

[jira] [Commented] (SPARK-5226) Add DBSCAN Clustering Algorithm to MLlib

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086591#comment-15086591 ] Joseph K. Bradley commented on SPARK-5226: -- Aliaksei did create a package: [http

[jira] [Commented] (SPARK-12675) Executor dies because of ClassCastException and causes timeout

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086577#comment-15086577 ] Joseph K. Bradley commented on SPARK-12675: --- That's a very large number of task

[jira] [Resolved] (SPARK-12681) Split IdentifiersParser.g to avoid single huge java source

2016-01-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-12681. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 10624 [https://github.

[jira] [Commented] (SPARK-12197) Kryo's Avro Serializer add support for dynamic schemas using SchemaRepository

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086441#comment-15086441 ] Apache Spark commented on SPARK-12197: -- User 'RotemShaul' has created a pull request

[jira] [Updated] (SPARK-12683) SQL timestamp is wrong when accessed as Python datetime

2016-01-06 Thread Gerhard Fiedler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gerhard Fiedler updated SPARK-12683: Attachment: spark_bug_date.py The code from the description is attached as spark_bug_date.p

[jira] [Created] (SPARK-12683) SQL timestamp is wrong when accessed as Python datetime

2016-01-06 Thread Gerhard Fiedler (JIRA)
Gerhard Fiedler created SPARK-12683: --- Summary: SQL timestamp is wrong when accessed as Python datetime Key: SPARK-12683 URL: https://issues.apache.org/jira/browse/SPARK-12683 Project: Spark

[jira] [Created] (SPARK-12682) Hive will fail if the schema of a parquet table has a very wide schema

2016-01-06 Thread Yin Huai (JIRA)
Yin Huai created SPARK-12682: Summary: Hive will fail if the schema of a parquet table has a very wide schema Key: SPARK-12682 URL: https://issues.apache.org/jira/browse/SPARK-12682 Project: Spark

[jira] [Updated] (SPARK-7689) Remove TTL-based metadata cleaning (spark.cleaner.ttl)

2016-01-06 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-7689: -- Labels: releasenotes (was: ) > Remove TTL-based metadata cleaning (spark.cleaner.ttl) > ---

[jira] [Commented] (SPARK-12648) UDF with Option[Double] throws ClassCastException

2016-01-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086388#comment-15086388 ] Jakob Odersky commented on SPARK-12648: --- In spark-shell: {code} val df = sc.paralle

[jira] [Comment Edited] (SPARK-12648) UDF with Option[Double] throws ClassCastException

2016-01-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086388#comment-15086388 ] Jakob Odersky edited comment on SPARK-12648 at 1/6/16 10:22 PM: ---

[jira] [Updated] (SPARK-12673) Prepending base URI of job description is missing

2016-01-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-12673: - Affects Version/s: 1.5.2 > Prepending base URI of job description is missing > --

[jira] [Commented] (SPARK-12672) Streaming batch ui can't be opened in jobs page in yarn mode.

2016-01-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086364#comment-15086364 ] Shixiong Zhu commented on SPARK-12672: -- Reverted this one. Will merge SPARK-12673 in

[jira] [Resolved] (SPARK-12672) Streaming batch ui can't be opened in jobs page in yarn mode.

2016-01-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-12672. -- Resolution: Duplicate Fix Version/s: (was: 1.6.1) (was: 1.5.3)

[jira] [Reopened] (SPARK-12672) Streaming batch ui can't be opened in jobs page in yarn mode.

2016-01-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reopened SPARK-12672: -- > Streaming batch ui can't be opened in jobs page in yarn mode. > -

[jira] [Comment Edited] (SPARK-12430) Temporary folders do not get deleted after Task completes causing problems with disk space.

2016-01-06 Thread Fede Bar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086339#comment-15086339 ] Fede Bar edited comment on SPARK-12430 at 1/6/16 9:48 PM: -- Thank

[jira] [Assigned] (SPARK-12681) Split IdentifiersParser.g to avoid single huge java source

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12681: Assignee: Davies Liu (was: Apache Spark) > Split IdentifiersParser.g to avoid single huge

[jira] [Commented] (SPARK-12681) Split IdentifiersParser.g to avoid single huge java source

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086345#comment-15086345 ] Apache Spark commented on SPARK-12681: -- User 'davies' has created a pull request for

[jira] [Assigned] (SPARK-12681) Split IdentifiersParser.g to avoid single huge java source

2016-01-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-12681: Assignee: Apache Spark (was: Davies Liu) > Split IdentifiersParser.g to avoid single huge

[jira] [Updated] (SPARK-12655) GraphX does not unpersist RDDs

2016-01-06 Thread Alexander Pivovarov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated SPARK-12655: Affects Version/s: 1.6.0 Description: Looks like Graph does not clean all

[jira] [Commented] (SPARK-12430) Temporary folders do not get deleted after Task completes causing problems with disk space.

2016-01-06 Thread Fede Bar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086339#comment-15086339 ] Fede Bar commented on SPARK-12430: -- Thanks for the follow up, I assume the key point her

[jira] [Assigned] (SPARK-12681) Split IdentifiersParser.g to avoid single huge java source

2016-01-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-12681: -- Assignee: Davies Liu > Split IdentifiersParser.g to avoid single huge java source > --

[jira] [Created] (SPARK-12681) Split IdentifiersParser.g to avoid single huge java source

2016-01-06 Thread Davies Liu (JIRA)
Davies Liu created SPARK-12681: -- Summary: Split IdentifiersParser.g to avoid single huge java source Key: SPARK-12681 URL: https://issues.apache.org/jira/browse/SPARK-12681 Project: Spark Issue

[jira] [Resolved] (SPARK-12672) Streaming batch ui can't be opened in jobs page in yarn mode.

2016-01-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-12672. -- Resolution: Fixed Assignee: SaintBacchus Fix Version/s: 2.0.0

[jira] [Updated] (SPARK-12672) Streaming batch ui can't be opened in jobs page in yarn mode.

2016-01-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-12672: - Affects Version/s: (was: 1.6.1) 1.5.2 > Streaming batch ui can't be op

[jira] [Commented] (SPARK-9844) File appender race condition during SparkWorker shutdown

2016-01-06 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086241#comment-15086241 ] Yin Huai commented on SPARK-9844: - I think no one is working on it. Feel free to send out

[jira] [Created] (SPARK-12680) Loading Word2Vec model in pyspark gives "ValueError: too many values to unpack" in findSynonyms

2016-01-06 Thread Sloane Simmons (JIRA)
Sloane Simmons created SPARK-12680: -- Summary: Loading Word2Vec model in pyspark gives "ValueError: too many values to unpack" in findSynonyms Key: SPARK-12680 URL: https://issues.apache.org/jira/browse/SPARK-126

[jira] [Commented] (SPARK-12679) Collect method return empty rows.

2016-01-06 Thread Sasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086204#comment-15086204 ] Sasi commented on SPARK-12679: -- The scenario that I can explain is: 1) Aerospike table is

[jira] [Commented] (SPARK-9844) File appender race condition during SparkWorker shutdown

2016-01-06 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086202#comment-15086202 ] Bryan Cutler commented on SPARK-9844: - I came across this error recently too as of 1.6

[jira] [Resolved] (SPARK-12679) Collect method return empty rows.

2016-01-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-12679. --- Resolution: Invalid I can't make out what you're describing here. For example, the result is correct

[jira] [Created] (SPARK-12679) Collect method return empty rows.

2016-01-06 Thread Sasi (JIRA)
Sasi created SPARK-12679: Summary: Collect method return empty rows. Key: SPARK-12679 URL: https://issues.apache.org/jira/browse/SPARK-12679 Project: Spark Issue Type: Bug Components: SQL

[jira] [Updated] (SPARK-12617) socket descriptor leak killing streaming app

2016-01-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-12617: - Affects Version/s: 1.6.0 > socket descriptor leak killing streaming app > ---

[jira] [Resolved] (SPARK-12368) Better doc for the binary classification evaluator' metricName

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-12368. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 10328 [h

[jira] [Resolved] (SPARK-12006) GaussianMixture.train crashes if an initial model is not None

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-12006. --- Resolution: Fixed Fix Version/s: 1.4.2 1.6.1

[jira] [Updated] (SPARK-12006) GaussianMixture.train crashes if an initial model is not None

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12006: -- Shepherd: Joseph K. Bradley > GaussianMixture.train crashes if an initial model is not

[jira] [Commented] (SPARK-12650) No means to specify Xmx settings for SparkSubmit in yarn-cluster mode

2016-01-06 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086173#comment-15086173 ] Marcelo Vanzin commented on SPARK-12650: Can you try the env variables I mention

[jira] [Updated] (SPARK-12006) GaussianMixture.train crashes if an initial model is not None

2016-01-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12006: -- Target Version/s: 1.4.2, 1.5.3, 1.6.1, 2.0.0 > GaussianMixture.train crashes if an init

[jira] [Commented] (SPARK-12650) No means to specify Xmx settings for SparkSubmit in yarn-cluster mode

2016-01-06 Thread John Vines (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086167#comment-15086167 ] John Vines commented on SPARK-12650: I do care about knowing when the spark job is fi

[jira] [Commented] (SPARK-12650) No means to specify Xmx settings for SparkSubmit in yarn-cluster mode

2016-01-06 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086148#comment-15086148 ] Marcelo Vanzin commented on SPARK-12650: Do you need the launcher process around

[jira] [Updated] (SPARK-12678) MapPartitionsRDD should clear reference to prev RDD

2016-01-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-12678: -- Summary: MapPartitionsRDD should clear reference to prev RDD (was: MapPartitionsRDD) > MapPartitionsR

[jira] [Commented] (SPARK-12650) No means to specify Xmx settings for SparkSubmit in yarn-cluster mode

2016-01-06 Thread John Vines (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086138#comment-15086138 ] John Vines commented on SPARK-12650: I'm launching the spark job from inside an App M

[jira] [Commented] (SPARK-12650) No means to specify Xmx settings for SparkSubmit in yarn-cluster mode

2016-01-06 Thread John Vines (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086135#comment-15086135 ] John Vines commented on SPARK-12650: {code}[root@datanode1-systemtest-john-1 /]# java

[jira] [Commented] (SPARK-12650) No means to specify Xmx settings for SparkSubmit in yarn-cluster mode

2016-01-06 Thread John Vines (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086133#comment-15086133 ] John Vines commented on SPARK-12650: In the test example I was using, I set driver an
