[jira] [Closed] (SPARK-4680) Add support for no-op compression

2014-12-02 Thread Victor Tso (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Victor Tso closed SPARK-4680. - Resolution: Not a Problem spark.broadcast.compress spark.rdd.compress spark.shuffle.compress These

[jira] [Commented] (SPARK-4644) Implement skewed join

2014-12-02 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231161#comment-14231161 ] Lianhui Wang commented on SPARK-4644: - Shixiong Zhu yes, i agree with you. i will take

[jira] [Commented] (SPARK-3638) Commons HTTP client dependency conflict in extras/kinesis-asl module

2014-12-02 Thread A.K.M. Ashrafuzzaman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231168#comment-14231168 ] A.K.M. Ashrafuzzaman commented on SPARK-3638: - [~aniket] Yes you are right. I

[jira] [Comment Edited] (SPARK-4644) Implement skewed join

2014-12-02 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231161#comment-14231161 ] Lianhui Wang edited comment on SPARK-4644 at 12/2/14 8:48 AM: --

[jira] [Comment Edited] (SPARK-4644) Implement skewed join

2014-12-02 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231161#comment-14231161 ] Lianhui Wang edited comment on SPARK-4644 at 12/2/14 8:48 AM: --

[jira] [Created] (SPARK-4691) code optimization for judgement

2014-12-02 Thread maji2014 (JIRA)
maji2014 created SPARK-4691: --- Summary: code optimization for judgement Key: SPARK-4691 URL: https://issues.apache.org/jira/browse/SPARK-4691 Project: Spark Issue Type: Bug Reporter:

[jira] [Commented] (SPARK-4156) Add expectation maximization for Gaussian mixture models to MLLib clustering

2014-12-02 Thread Meethu Mathew (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231226#comment-14231226 ] Meethu Mathew commented on SPARK-4156: -- We had run the GMM code on two public

[jira] [Commented] (SPARK-4691) code optimization for judgement

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231243#comment-14231243 ] Apache Spark commented on SPARK-4691: - User 'maji2014' has created a pull request for

[jira] [Commented] (SPARK-4685) Update JavaDoc settings to include spark.ml and all spark.mllib subpackages in the right sections

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231275#comment-14231275 ] Apache Spark commented on SPARK-4685: - User 'Lewuathe' has created a pull request for

[jira] [Created] (SPARK-4692) Support ! boolean logic operator like NOT

2014-12-02 Thread YanTang Zhai (JIRA)
YanTang Zhai created SPARK-4692: --- Summary: Support ! boolean logic operator like NOT Key: SPARK-4692 URL: https://issues.apache.org/jira/browse/SPARK-4692 Project: Spark Issue Type:

[jira] [Commented] (SPARK-4692) Support ! boolean logic operator like NOT

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231308#comment-14231308 ] Apache Spark commented on SPARK-4692: - User 'YanTangZhai' has created a pull request

[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS

2014-12-02 Thread Valeriy Avanesov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231375#comment-14231375 ] Valeriy Avanesov edited comment on SPARK-2426 at 12/2/14 11:47 AM:

[jira] [Commented] (SPARK-2426) Quadratic Minimization for MLlib ALS

2014-12-02 Thread Valeriy Avanesov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231375#comment-14231375 ] Valeriy Avanesov commented on SPARK-2426: - I'm not sure if I understand your

[jira] [Comment Edited] (SPARK-2426) Quadratic Minimization for MLlib ALS

2014-12-02 Thread Valeriy Avanesov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231375#comment-14231375 ] Valeriy Avanesov edited comment on SPARK-2426 at 12/2/14 11:47 AM:

[jira] [Created] (SPARK-4693) PruningPredicates may be wrong if predicates contains an empty AttributeSet() references

2014-12-02 Thread YanTang Zhai (JIRA)
Project: Spark Issue Type: Bug Components: SQL Reporter: YanTang Zhai Priority: Minor The sql select * from spark_test::for_test where abs(20141202) is not null has predicates=List(IS NOT NULL HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFAbs

[jira] [Commented] (SPARK-4693) PruningPredicates may be wrong if predicates contains an empty AttributeSet() references

2014-12-02 Thread Apache Spark (JIRA)
: https://issues.apache.org/jira/browse/SPARK-4693 Project: Spark Issue Type: Bug Components: SQL Reporter: YanTang Zhai Priority: Minor The sql select * from spark_test::for_test where abs(20141202) is not null has predicates=List(IS NOT NULL

[jira] [Created] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode

2014-12-02 Thread SaintBacchus (JIRA)
SaintBacchus created SPARK-4694: --- Summary: Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode Key: SPARK-4694 URL: https://issues.apache.org/jira/browse/SPARK-4694

[jira] [Updated] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode

2014-12-02 Thread SaintBacchus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] SaintBacchus updated SPARK-4694: Component/s: YARN Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in

[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode

2014-12-02 Thread SaintBacchus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231440#comment-14231440 ] SaintBacchus commented on SPARK-4694: - The reason was that Yarn had reported the

[jira] [Commented] (SPARK-4156) Add expectation maximization for Gaussian mixture models to MLLib clustering

2014-12-02 Thread Travis Galoppo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231474#comment-14231474 ] Travis Galoppo commented on SPARK-4156: --- Ok, I looked into this. This is the result

[jira] [Commented] (SPARK-2710) Build SchemaRDD from a JdbcRDD with MetaData (no hard-coded case class)

2014-12-02 Thread Joerg Schad (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231481#comment-14231481 ] Joerg Schad commented on SPARK-2710: Hi, is there any documentation about the Data

[jira] [Created] (SPARK-4695) Get result using executeCollect in spark sql

2014-12-02 Thread wangfei (JIRA)
wangfei created SPARK-4695: -- Summary: Get result using executeCollect in spark sql Key: SPARK-4695 URL: https://issues.apache.org/jira/browse/SPARK-4695 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-4695) Get result using executeCollect in spark sql

2014-12-02 Thread wangfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangfei updated SPARK-4695: --- Issue Type: Improvement (was: Bug) Get result using executeCollect in spark sql

[jira] [Commented] (SPARK-4695) Get result using executeCollect in spark sql

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231560#comment-14231560 ] Apache Spark commented on SPARK-4695: - User 'scwf' has created a pull request for this

[jira] [Created] (SPARK-4696) Yarn: spark.driver.extra* variables not applied consistently to yarn client mode AM

2014-12-02 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-4696: Summary: Yarn: spark.driver.extra* variables not applied consistently to yarn client mode AM Key: SPARK-4696 URL: https://issues.apache.org/jira/browse/SPARK-4696

[jira] [Created] (SPARK-4697) System properties should override environment variables

2014-12-02 Thread WangTaoTheTonic (JIRA)
WangTaoTheTonic created SPARK-4697: -- Summary: System properties should override environment variables Key: SPARK-4697 URL: https://issues.apache.org/jira/browse/SPARK-4697 Project: Spark

[jira] [Commented] (SPARK-4697) System properties should override environment variables

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231589#comment-14231589 ] Apache Spark commented on SPARK-4697: - User 'WangTaoTheTonic' has created a pull

[jira] [Created] (SPARK-4698) Data-locality aware Partitioners

2014-12-02 Thread Kevin Mader (JIRA)
Kevin Mader created SPARK-4698: -- Summary: Data-locality aware Partitioners Key: SPARK-4698 URL: https://issues.apache.org/jira/browse/SPARK-4698 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-4560) Lambda deserialization error

2014-12-02 Thread Alexis Seigneurin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis Seigneurin updated SPARK-4560: - Affects Version/s: 1.1.1 Lambda deserialization error

[jira] [Commented] (SPARK-4156) Add expectation maximization for Gaussian mixture models to MLLib clustering

2014-12-02 Thread Travis Galoppo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231682#comment-14231682 ] Travis Galoppo commented on SPARK-4156: --- I do have a bug in the DenseGmmEM example

[jira] [Commented] (SPARK-3553) Spark Streaming app streams files that have already been streamed in an endless loop

2014-12-02 Thread Micael Capitão (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231697#comment-14231697 ] Micael Capitão commented on SPARK-3553: --- I'm having that same issue running Spark

[jira] [Comment Edited] (SPARK-3553) Spark Streaming app streams files that have already been streamed in an endless loop

2014-12-02 Thread Micael Capitão (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231697#comment-14231697 ] Micael Capitão edited comment on SPARK-3553 at 12/2/14 4:28 PM:

[jira] [Comment Edited] (SPARK-3553) Spark Streaming app streams files that have already been streamed in an endless loop

2014-12-02 Thread Micael Capitão (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231697#comment-14231697 ] Micael Capitão edited comment on SPARK-3553 at 12/2/14 4:30 PM:

[jira] [Commented] (SPARK-3523) GraphX graph partitioning strategy

2014-12-02 Thread Larry Xiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231734#comment-14231734 ] Larry Xiao commented on SPARK-3523: --- Hi [~lianhuiwang], that's great! As you can see, to

[jira] [Commented] (SPARK-3553) Spark Streaming app streams files that have already been streamed in an endless loop

2014-12-02 Thread Ezequiel Bella (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231737#comment-14231737 ] Ezequiel Bella commented on SPARK-3553: --- Please see if this post works for you,

[jira] [Comment Edited] (SPARK-3523) GraphX graph partitioning strategy

2014-12-02 Thread Larry Xiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231734#comment-14231734 ] Larry Xiao edited comment on SPARK-3523 at 12/2/14 4:50 PM: Hi

[jira] [Created] (SPARK-4699) Make caseSensitive configurable in Analyzer.scala

2014-12-02 Thread Jacky Li (JIRA)
Jacky Li created SPARK-4699: --- Summary: Make caseSensitive configurable in Analyzer.scala Key: SPARK-4699 URL: https://issues.apache.org/jira/browse/SPARK-4699 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-4686) Link to allowed master URLs is broken in configuration documentation

2014-12-02 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-4686. --- Resolution: Fixed Fix Version/s: 1.1.2 1.2.0 Link to allowed

[jira] [Commented] (SPARK-4699) Make caseSensitive configurable in Analyzer.scala

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231797#comment-14231797 ] Apache Spark commented on SPARK-4699: - User 'jackylk' has created a pull request for

[jira] [Commented] (SPARK-3553) Spark Streaming app streams files that have already been streamed in an endless loop

2014-12-02 Thread Micael Capitão (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231815#comment-14231815 ] Micael Capitão commented on SPARK-3553: --- I've already seen that post. It didn't work

[jira] [Updated] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-4672: --- Fix Version/s: 1.2.0 Cut off the super long serialization chain in GraphX to avoid the

[jira] [Commented] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231891#comment-14231891 ] Reynold Xin commented on SPARK-4672: cc [~ankurdave] Can you take a look at this

[jira] [Created] (SPARK-4700) Add Http support to Spark Thrift server

2014-12-02 Thread Judy Nash (JIRA)
Judy Nash created SPARK-4700: Summary: Add Http support to Spark Thrift server Key: SPARK-4700 URL: https://issues.apache.org/jira/browse/SPARK-4700 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-3641) Correctly populate SparkPlan.currentContext

2014-12-02 Thread Kapil Malik (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231964#comment-14231964 ] Kapil Malik commented on SPARK-3641: Hi all, Is this expected to be fixed with Spark

[jira] [Commented] (SPARK-4616) SPARK_CONF_DIR is not effective in spark-submit

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232030#comment-14232030 ] Apache Spark commented on SPARK-4616: - User 'brennonyork' has created a pull request

[jira] [Created] (SPARK-4701) Typo in sbt/sbt

2014-12-02 Thread Masayoshi TSUZUKI (JIRA)
Masayoshi TSUZUKI created SPARK-4701: Summary: Typo in sbt/sbt Key: SPARK-4701 URL: https://issues.apache.org/jira/browse/SPARK-4701 Project: Spark Issue Type: Bug Components:

[jira] [Commented] (SPARK-4701) Typo in sbt/sbt

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232037#comment-14232037 ] Apache Spark commented on SPARK-4701: - User 'tsudukim' has created a pull request for

[jira] [Updated] (SPARK-4536) Add sqrt and abs to Spark SQL DSL

2014-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4536: Assignee: Kousuke Saruta Add sqrt and abs to Spark SQL DSL

[jira] [Resolved] (SPARK-4536) Add sqrt and abs to Spark SQL DSL

2014-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4536. - Resolution: Fixed Fix Version/s: 1.2.0 Target Version/s: 1.2.0 (was:

[jira] [Updated] (SPARK-3641) Correctly populate SparkPlan.currentContext

2014-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3641: Fix Version/s: 1.2.0 Correctly populate SparkPlan.currentContext

[jira] [Created] (SPARK-4702) Querying non-existent partition produces exception in v1.2.0-rc1

2014-12-02 Thread Yana Kadiyska (JIRA)
Yana Kadiyska created SPARK-4702: Summary: Querying non-existent partition produces exception in v1.2.0-rc1 Key: SPARK-4702 URL: https://issues.apache.org/jira/browse/SPARK-4702 Project: Spark

[jira] [Commented] (SPARK-3641) Correctly populate SparkPlan.currentContext

2014-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232080#comment-14232080 ] Michael Armbrust commented on SPARK-3641: - This has been fixed for a while and

[jira] [Resolved] (SPARK-4663) close() function is not surrounded by finally in ParquetTableOperations.scala

2014-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4663. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3526

[jira] [Created] (SPARK-4703) Windows path resolution is incorrect

2014-12-02 Thread Andrew Or (JIRA)
Andrew Or created SPARK-4703: Summary: Windows path resolution is incorrect Key: SPARK-4703 URL: https://issues.apache.org/jira/browse/SPARK-4703 Project: Spark Issue Type: Bug

[jira] [Closed] (SPARK-4703) Windows path resolution is incorrect

2014-12-02 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4703. Resolution: Not a Problem This is a mistake on my part. The path file:/C:/path/my.jar is actually a valid
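
The closing note on SPARK-4703 can be checked with a small sketch. It assumes the truncated sentence refers to `file:/C:/path/my.jar` being a well-formed file URI; the helper name `describe_file_uri` is hypothetical, and the code uses Python's standard `urllib.parse` rather than Spark's own path-resolution logic.

```python
from urllib.parse import urlparse


def describe_file_uri(uri: str) -> dict:
    """Split a URI into scheme, authority, and path (hypothetical helper)."""
    parts = urlparse(uri)
    return {
        "scheme": parts.scheme,     # the part before the first ':'
        "authority": parts.netloc,  # empty when the URI has no '//host' section
        "path": parts.path,         # the drive letter stays inside the path
    }


# The path quoted in the SPARK-4703 closing comment.
info = describe_file_uri("file:/C:/path/my.jar")
print(info)
```

The single-slash form parses with an empty authority and keeps `C:` inside the path, which is consistent with the issue being closed as "Not a Problem".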

[jira] [Commented] (SPARK-1867) Spark Documentation Error causes java.lang.IllegalStateException: unread block data

2014-12-02 Thread Anson Abraham (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232181#comment-14232181 ] Anson Abraham commented on SPARK-1867: -- interesting. so it's possible spark-shell

[jira] [Created] (SPARK-4704) SparkSubmitDriverBootstrap doesn't flush output

2014-12-02 Thread Stephen Haberman (JIRA)
Stephen Haberman created SPARK-4704: --- Summary: SparkSubmitDriverBootstrap doesn't flush output Key: SPARK-4704 URL: https://issues.apache.org/jira/browse/SPARK-4704 Project: Spark Issue

[jira] [Issue Comment Deleted] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-4672: --- Comment: was deleted (was: User 'JerryLead' has created a pull request for this issue:

[jira] [Issue Comment Deleted] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-4672: --- Comment: was deleted (was: User 'JerryLead' has created a pull request for this issue:

[jira] [Issue Comment Deleted] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-4672: --- Comment: was deleted (was: User 'JerryLead' has created a pull request for this issue:

[jira] [Commented] (SPARK-4688) Have a single shared network timeout in Spark

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232219#comment-14232219 ] Apache Spark commented on SPARK-4688: - User 'varunsaxena' has created a pull request

[jira] [Commented] (SPARK-1127) Add saveAsHBase to PairRDDFunctions

2014-12-02 Thread Ted Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1423#comment-1423 ] Ted Yu commented on SPARK-1127: --- According to Reynold, First half of the external data

[jira] [Resolved] (SPARK-4676) JavaSchemaRDD.schema may throw NullType MatchError if sql has null

2014-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4676. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3538

[jira] [Resolved] (SPARK-4593) sum(1/0) would produce a very large number

2014-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4593. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3443

[jira] [Resolved] (SPARK-4670) bitwise NOT has a wrong `toString` output

2014-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4670. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3528

[jira] [Resolved] (SPARK-4695) Get result using executeCollect in spark sql

2014-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4695. - Resolution: Fixed Fix Version/s: (was: 1.3.0) 1.2.0 Issue

[jira] [Created] (SPARK-4705) Driver retries in yarn-cluster mode always fail if event logging is enabled

2014-12-02 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-4705: - Summary: Driver retries in yarn-cluster mode always fail if event logging is enabled Key: SPARK-4705 URL: https://issues.apache.org/jira/browse/SPARK-4705 Project:

[jira] [Created] (SPARK-4706) Remove FakeParquetSerDe

2014-12-02 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-4706: --- Summary: Remove FakeParquetSerDe Key: SPARK-4706 URL: https://issues.apache.org/jira/browse/SPARK-4706 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-3431) Parallelize execution of tests

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232309#comment-14232309 ] Apache Spark commented on SPARK-3431: - User 'nchammas' has created a pull request for

[jira] [Created] (SPARK-4707) Reliable Kafka Receiver can lose data if the block generator fails to store data

2014-12-02 Thread Hari Shreedharan (JIRA)
Hari Shreedharan created SPARK-4707: --- Summary: Reliable Kafka Receiver can lose data if the block generator fails to store data Key: SPARK-4707 URL: https://issues.apache.org/jira/browse/SPARK-4707

[jira] [Updated] (SPARK-4525) MesosSchedulerBackend.resourceOffers cannot decline unused offers from acceptedOffers

2014-12-02 Thread Jongyoul Lee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jongyoul Lee updated SPARK-4525: Shepherd: (was: Andrew Or) MesosSchedulerBackend.resourceOffers cannot decline unused offers

[jira] [Updated] (SPARK-4575) Documentation for the pipeline features

2014-12-02 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4575: - Assignee: Joseph K. Bradley (was: Xiangrui Meng) Documentation for the pipeline features

[jira] [Created] (SPARK-4708) k-mean runs two/three times faster with dense/sparse sample

2014-12-02 Thread DB Tsai (JIRA)
DB Tsai created SPARK-4708: -- Summary: k-mean runs two/three times faster with dense/sparse sample Key: SPARK-4708 URL: https://issues.apache.org/jira/browse/SPARK-4708 Project: Spark Issue Type:

[jira] [Updated] (SPARK-4708) Make k-mean runs two/three times faster with dense/sparse sample

2014-12-02 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-4708: --- Component/s: MLlib Make k-mean runs two/three times faster with dense/sparse sample

[jira] [Updated] (SPARK-4708) Make k-mean runs two/three times faster with dense/sparse sample

2014-12-02 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-4708: --- Summary: Make k-mean runs two/three times faster with dense/sparse sample (was: k-mean runs two/three times

[jira] [Commented] (SPARK-4708) Make k-mean runs two/three times faster with dense/sparse sample

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232428#comment-14232428 ] Apache Spark commented on SPARK-4708: - User 'dbtsai' has created a pull request for

[jira] [Commented] (SPARK-2710) Build SchemaRDD from a JdbcRDD with MetaData (no hard-coded case class)

2014-12-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232431#comment-14232431 ] Michael Armbrust commented on SPARK-2710: - I'd suggest looking at the test cases

[jira] [Created] (SPARK-4709) Spark SQL support for Parquet with timestamp type field

2014-12-02 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-4709: --- Summary: Spark SQL support for Parquet with timestamp type field Key: SPARK-4709 URL: https://issues.apache.org/jira/browse/SPARK-4709 Project: Spark Issue

[jira] [Updated] (SPARK-4709) Spark SQL support error reading Parquet with timestamp type field

2014-12-02 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-4709: Summary: Spark SQL support error reading Parquet with timestamp type field (was: Spark SQL support

[jira] [Updated] (SPARK-4707) Reliable Kafka Receiver can lose data if the block generator fails to store data

2014-12-02 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4707: --- Priority: Critical (was: Major) Reliable Kafka Receiver can lose data if the block

[jira] [Commented] (SPARK-874) Have a --wait flag in ./sbin/stop-all.sh that polls until Worker's are finished

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232504#comment-14232504 ] Apache Spark commented on SPARK-874: User 'jbencook' has created a pull request for

[jira] [Commented] (SPARK-4644) Implement skewed join

2014-12-02 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232522#comment-14232522 ] Shixiong Zhu commented on SPARK-4644: - Looks `groupByKey` is really different from

[jira] [Commented] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Jason Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232524#comment-14232524 ] Jason Dai commented on SPARK-4672: -- We ran into the same issue, and this is a nice

[jira] [Commented] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232527#comment-14232527 ] Reynold Xin commented on SPARK-4672: Yea it makes sense to remove all the function

[jira] [Commented] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Jason Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232529#comment-14232529 ] Jason Dai commented on SPARK-4672: -- [~rxin] what exactly do you mean by remove all the

[jira] [Created] (SPARK-4710) Fix MLlib compilation warnings

2014-12-02 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-4710: Summary: Fix MLlib compilation warnings Key: SPARK-4710 URL: https://issues.apache.org/jira/browse/SPARK-4710 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-4710) Fix MLlib compilation warnings

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232534#comment-14232534 ] Apache Spark commented on SPARK-4710: - User 'jkbradley' has created a pull request for

[jira] [Commented] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232533#comment-14232533 ] Reynold Xin commented on SPARK-4672: Ok I admit I wasn't reading your comment too

[jira] [Created] (SPARK-4711) MLlib optimization: docs should suggest how to choose optimizer

2014-12-02 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-4711: Summary: MLlib optimization: docs should suggest how to choose optimizer Key: SPARK-4711 URL: https://issues.apache.org/jira/browse/SPARK-4711 Project: Spark

[jira] [Commented] (SPARK-4711) MLlib optimization: docs should suggest how to choose optimizer

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232536#comment-14232536 ] Apache Spark commented on SPARK-4711: - User 'jkbradley' has created a pull request for

[jira] [Created] (SPARK-4712) uploading jar when set spark.yarn.jar

2014-12-02 Thread Hong Shen (JIRA)
Hong Shen created SPARK-4712: Summary: uploading jar when set spark.yarn.jar Key: SPARK-4712 URL: https://issues.apache.org/jira/browse/SPARK-4712 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-4712) uploading jar when set spark.yarn.jar

2014-12-02 Thread Hong Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232543#comment-14232543 ] Hong Shen commented on SPARK-4712: -- The reason is that HDFS is in HA mode, and spark client

[jira] [Closed] (SPARK-4712) uploading jar when set spark.yarn.jar

2014-12-02 Thread Hong Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Shen closed SPARK-4712. Resolution: Won't Fix uploading jar when set spark.yarn.jar --

[jira] [Commented] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Jason Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232549#comment-14232549 ] Jason Dai commented on SPARK-4672: -- I can see two possible ways to fix this: # Define

[jira] [Commented] (SPARK-3910) ./python/pyspark/mllib/classification.py doctests fails with module name pollution

2014-12-02 Thread Yu Ishikawa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232598#comment-14232598 ] Yu Ishikawa commented on SPARK-3910: I had the same problem as Tomohiko.

[jira] [Commented] (SPARK-3717) DecisionTree, RandomForest: Partition by feature

2014-12-02 Thread SUMANTH B B N (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232604#comment-14232604 ] SUMANTH B B N commented on SPARK-3717: -- [~josephkb] I completed my implementation and

[jira] [Created] (SPARK-4713) SchemaRDD.unpersist() should not raise exception if it is not cached.

2014-12-02 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4713: Summary: SchemaRDD.unpersist() should not raise exception if it is not cached. Key: SPARK-4713 URL: https://issues.apache.org/jira/browse/SPARK-4713 Project: Spark

[jira] [Commented] (SPARK-4713) SchemaRDD.unpersist() should not raise exception if it is not cached.

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232608#comment-14232608 ] Apache Spark commented on SPARK-4713: - User 'chenghao-intel' has created a pull

[jira] [Commented] (SPARK-4397) Reorganize 'implicit's to improve the API convenience

2014-12-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232632#comment-14232632 ] Apache Spark commented on SPARK-4397: - User 'zsxwing' has created a pull request for

[jira] [Comment Edited] (SPARK-4672) Cut off the super long serialization chain in GraphX to avoid the StackOverflow error

2014-12-02 Thread Jason Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232549#comment-14232549 ] Jason Dai edited comment on SPARK-4672 at 12/3/14 6:01 AM: --- I
