[jira] [Commented] (SPARK-16163) Statistics of logical plan is super slow on large query

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347022#comment-15347022 ] Sean Owen commented on SPARK-16163: --- [~davies] these need to be resolved for 2.0.1 not 2.0.0

[jira] [Updated] (SPARK-16163) Statistics of logical plan is super slow on large query

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16163: -- Fix Version/s: (was: 2.0.0) 2.0.1 > Statistics of logical plan is super slow on

[jira] [Created] (SPARK-16173) Can't join describe() of DataFrame in Scala 2.10

2016-06-23 Thread Davies Liu (JIRA)
Davies Liu created SPARK-16173: -- Summary: Can't join describe() of DataFrame in Scala 2.10 Key: SPARK-16173 URL: https://issues.apache.org/jira/browse/SPARK-16173 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-13723) YARN - Change behavior of --num-executors when spark.dynamicAllocation.enabled true

2016-06-23 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-13723. --- Resolution: Fixed Fix Version/s: 2.0.0 > YARN - Change behavior of --num-executors

[jira] [Updated] (SPARK-13723) YARN - Change behavior of --num-executors when spark.dynamicAllocation.enabled true

2016-06-23 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-13723: -- Assignee: Ryan Blue > YARN - Change behavior of --num-executors when >

[jira] [Updated] (SPARK-16143) Group survival analysis methods in generated doc

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-16143: -- Assignee: Junyang Qian > Group survival analysis methods in generated doc >

[jira] [Assigned] (SPARK-16142) Group naive Bayes methods in generated doc

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-16142: - Assignee: Xiangrui Meng > Group naive Bayes methods in generated doc >

[jira] [Resolved] (SPARK-15725) Dynamic allocation hangs YARN app when executors time out

2016-06-23 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-15725. --- Resolution: Fixed Fix Version/s: 2.0.0 > Dynamic allocation hangs YARN app when

[jira] [Commented] (SPARK-15230) Back quoted column with dot in it fails when running distinct on dataframe

2016-06-23 Thread Bo Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346981#comment-15346981 ] Bo Meng commented on SPARK-15230: - Can anyone update the "Component/s" of this JIRA? It should belong to

[jira] [Updated] (SPARK-15725) Dynamic allocation hangs YARN app when executors time out

2016-06-23 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-15725: -- Assignee: Ryan Blue > Dynamic allocation hangs YARN app when executors time out >

[jira] [Resolved] (SPARK-16163) Statistics of logical plan is super slow on large query

2016-06-23 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16163. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13871

[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-06-23 Thread Alex Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346956#comment-15346956 ] Alex Jiang commented on SPARK-13288: [~jfc...@us.ibm.com] Did you use Kafka createStream or

[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-06-23 Thread Alex Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346951#comment-15346951 ] Alex Jiang commented on SPARK-13288: [~roberto_hashi...@hotmail.com] Thanks for your confirmation! >

[jira] [Updated] (SPARK-16130) model loading backward compatibility for ml.classfication.LogisticRegression

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-16130: -- Assignee: yuhao yang > model loading backward compatibility for

[jira] [Created] (SPARK-16172) SQL Context's

2016-06-23 Thread Scott Viteri (JIRA)
Scott Viteri created SPARK-16172: Summary: SQL Context's Key: SPARK-16172 URL: https://issues.apache.org/jira/browse/SPARK-16172 Project: Spark Issue Type: Bug Components: SQL

[jira] [Updated] (SPARK-16130) model loading backward compatibility for ml.classfication.LogisticRegression

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-16130: -- Fix Version/s: (was: 2.0.0) 2.1.0 2.0.1 > model

[jira] [Resolved] (SPARK-16130) model loading backward compatibility for ml.classfication.LogisticRegression

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-16130. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13841

[jira] [Resolved] (SPARK-16116) ConsoleSink should not require checkpointLocation

2016-06-23 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-16116. -- Resolution: Fixed Fix Version/s: 2.0.0 > ConsoleSink should not require

[jira] [Updated] (SPARK-16116) ConsoleSink should not require checkpointLocation

2016-06-23 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-16116: - Affects Version/s: 2.0.0 > ConsoleSink should not require checkpointLocation >

[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-06-23 Thread roberto hashioka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346803#comment-15346803 ] roberto hashioka commented on SPARK-13288: -- Yep, I tried it and I didn't see any memory usage

[jira] [Commented] (SPARK-16140) Group k-means method in generated doc

2016-06-23 Thread Xin Ren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346776#comment-15346776 ] Xin Ren commented on SPARK-16140: - OK, I'll target to finish it this weekend. Thanks for the tips, I'll

[jira] [Commented] (SPARK-12945) ERROR LiveListenerBus: Listener JobProgressListener threw an exception

2016-06-23 Thread DG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346774#comment-15346774 ] DG commented on SPARK-12945: Observing precisely the same issue on a smaller dataset (12 gigabytes). Spark

[jira] [Commented] (SPARK-16140) Group k-means method in generated doc

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346758#comment-15346758 ] Xiangrui Meng commented on SPARK-16140: --- Btw, please make minimal changes and do not move the code

[jira] [Commented] (SPARK-16140) Group k-means method in generated doc

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346752#comment-15346752 ] Xiangrui Meng commented on SPARK-16140: --- Thanks! Note that this is time sensitive for the release.

[jira] [Updated] (SPARK-16140) Group k-means method in generated doc

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-16140: -- Assignee: Xin Ren > Group k-means method in generated doc >

[jira] [Commented] (SPARK-16158) Support pluggable dynamic allocation heuristics

2016-06-23 Thread Nezih Yigitbasi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346744#comment-15346744 ] Nezih Yigitbasi commented on SPARK-16158: - Thanks [~sowen] for your input, I understand your

[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-06-23 Thread Alex Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346736#comment-15346736 ] Alex Jiang commented on SPARK-13288: [~roberto_hashi...@hotmail.com] Did you get chance to try the

[jira] [Resolved] (SPARK-16088) Update setJobGroup, clearJobGroup, cancelJobGroup SparkR API to not require sc

2016-06-23 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaram Venkataraman resolved SPARK-16088. --- Resolution: Fixed Fix Version/s: 2.0.1 Issue resolved by pull request

[jira] [Updated] (SPARK-16088) Update setJobGroup, clearJobGroup, cancelJobGroup SparkR API to not require sc

2016-06-23 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaram Venkataraman updated SPARK-16088: -- Assignee: Felix Cheung > Update setJobGroup, clearJobGroup, cancelJobGroup

[jira] [Deleted] (SPARK-16171) Filter UDFs in StringIndexer shouldn't throw exception

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng deleted SPARK-16171: -- > Filter UDFs in StringIndexer shouldn't throw exception >

[jira] [Commented] (SPARK-16055) sparkR.init() can not load sparkPackages when executing an R file

2016-06-23 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346708#comment-15346708 ] Shivaram Venkataraman commented on SPARK-16055: --- Thanks [~KrishnaKalyan3] -- Feel free to

[jira] [Updated] (SPARK-16171) Filter UDFs in StringIndexer shouldn't throw exception

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-16171: -- Description: [~cmccubbin] reported a bug when he used StringIndexer in an ML pipeline with

[jira] [Updated] (SPARK-16171) Filter UDFs in StringIndexer shouldn't throw exception

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-16171: -- Component/s: (was: SQL) MLlib > Filter UDFs in StringIndexer shouldn't

[jira] [Created] (SPARK-16171) Filter UDFs in StringIndexer shouldn't throw exception

2016-06-23 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-16171: - Summary: Filter UDFs in StringIndexer shouldn't throw exception Key: SPARK-16171 URL: https://issues.apache.org/jira/browse/SPARK-16171 Project: Spark

[jira] [Commented] (SPARK-16105) PCA Reverse Transformer

2016-06-23 Thread Stefan Panayotov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346676#comment-15346676 ] Stefan Panayotov commented on SPARK-16105: -- Well, what we currently do is: - save the PCA

[jira] [Resolved] (SPARK-16154) Update spark.ml and spark.mllib package docs

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-16154. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by

[jira] [Commented] (SPARK-16105) PCA Reverse Transformer

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346582#comment-15346582 ] Sean Owen commented on SPARK-16105: --- No, I understand that fine. I'm asking whether you actually need

[jira] [Commented] (SPARK-16105) PCA Reverse Transformer

2016-06-23 Thread Stefan Panayotov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346570#comment-15346570 ] Stefan Panayotov commented on SPARK-16105: -- I understand that the 'reverse' operation is a

[jira] [Comment Edited] (SPARK-16168) Spark sql can not read ORC table

2016-06-23 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346487#comment-15346487 ] Jeff Zhang edited comment on SPARK-16168 at 6/23/16 2:10 PM: - I don't think

[jira] [Commented] (SPARK-16168) Spark sql can not read ORC table

2016-06-23 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346487#comment-15346487 ] Jeff Zhang commented on SPARK-16168: I don't think it is spark issue, it is more likely your query

[jira] [Updated] (SPARK-16138) YarnAllocator tries to cancel executor requests when we have none

2016-06-23 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-16138: -- Assignee: Peter Ableda (was: Apache Spark) > YarnAllocator tries to cancel executor requests

[jira] [Resolved] (SPARK-16138) YarnAllocator tries to cancel executor requests when we have none

2016-06-23 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-16138. --- Resolution: Fixed Fix Version/s: 2.1.0 > YarnAllocator tries to cancel executor

[jira] [Commented] (SPARK-16146) Spark application failed by Yarn preempting

2016-06-23 Thread Nick Peterson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346436#comment-15346436 ] Nick Peterson commented on SPARK-16146: --- This is preemption in the yarn sense, not the cloud sense.

[jira] [Commented] (SPARK-16170) Throw error when row is not schema-compatible

2016-06-23 Thread Federico Ponzi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346420#comment-15346420 ] Federico Ponzi commented on SPARK-16170: Hi, and thanks for the response. I've set this as a

[jira] [Updated] (SPARK-16170) Throw error when row is not schema-compatible

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16170: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) I don't think that's a bug.

[jira] [Created] (SPARK-16170) Throw error when row is not schema-compatible

2016-06-23 Thread Federico Ponzi (JIRA)
Federico Ponzi created SPARK-16170: -- Summary: Throw error when row is not schema-compatible Key: SPARK-16170 URL: https://issues.apache.org/jira/browse/SPARK-16170 Project: Spark Issue

[jira] [Commented] (SPARK-16055) sparkR.init() can not load sparkPackages when executing an R file

2016-06-23 Thread Krishna Kalyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346305#comment-15346305 ] Krishna Kalyan commented on SPARK-16055: Hi [~shivaram], I can work on this issue. Could please

[jira] [Updated] (SPARK-13709) Spark unable to decode Avro when partitioned

2016-06-23 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-13709: --- Assignee: Cheng Lian > Spark unable to decode Avro when partitioned >

[jira] [Commented] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346274#comment-15346274 ] Sean Owen commented on SPARK-16169: --- If you're saving more data each time, this would make sense. It's

[jira] [Commented] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346267#comment-15346267 ] Manish Kumar commented on SPARK-16169: -- Hi [~srowen] I am not saying it is taking 5 minutes longer

[jira] [Comment Edited] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346267#comment-15346267 ] Manish Kumar edited comment on SPARK-16169 at 6/23/16 10:58 AM: Hi

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: - Description: When a spark application is (written in scala) trying to save intermediate

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: - Description: When a spark application is (written in scala) trying to save intermediate

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: - Description: When a spark application is (written in scala) trying to save intermediate

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: - Description: When a spark application is (written in scala) trying to save intermediate

[jira] [Commented] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346250#comment-15346250 ] Sean Owen commented on SPARK-16169: --- It's hard to help without knowing what you're saving, how, to

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: - Attachment: Spark-UI.png > Saving Intermediate dataframe increasing processing time upto 5

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: - Description: When a spark application is written in scala trying to save intermediate

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: - Description: When a spark application is (written in scala) trying to save intermediate

[jira] [Updated] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Kumar updated SPARK-16169: - Description: When, a spark application written in scala trying to save intermediate dataframe,

[jira] [Created] (SPARK-16169) Saving Intermediate dataframe increasing processing time upto 5 times.

2016-06-23 Thread Manish Kumar (JIRA)
Manish Kumar created SPARK-16169: Summary: Saving Intermediate dataframe increasing processing time upto 5 times. Key: SPARK-16169 URL: https://issues.apache.org/jira/browse/SPARK-16169 Project:

[jira] [Resolved] (SPARK-8884) 1-sample Anderson-Darling Goodness-of-Fit test

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-8884. -- Resolution: Won't Fix Target Version/s: (was: ) > 1-sample Anderson-Darling Goodness-of-Fit

[jira] [Resolved] (SPARK-12697) Allow adding new streams without stopping Spark streaming context

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-12697. --- Resolution: Won't Fix > Allow adding new streams without stopping Spark streaming context >

[jira] [Resolved] (SPARK-10465) Shortest Path between two vertices, using distance and results carries shortest path and distance

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-10465. --- Resolution: Won't Fix > Shortest Path between two vertices, using distance and results carries >

[jira] [Commented] (SPARK-14172) Hive table partition predicate not passed down correctly

2016-06-23 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346209#comment-15346209 ] Jiang Xingbo commented on SPARK-14172: -- In collectProjectsAndFilters function, currently we only

[jira] [Resolved] (SPARK-15660) Update RDD `variance/stdev` description and add popVariance/popStdev

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-15660. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 13403

[jira] [Updated] (SPARK-15660) Update RDD `variance/stdev` description and add popVariance/popStdev

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-15660: -- Assignee: Dongjoon Hyun > Update RDD `variance/stdev` description and add popVariance/popStdev >

[jira] [Created] (SPARK-16168) Spark sql can not read ORC table

2016-06-23 Thread AnfengYuan (JIRA)
AnfengYuan created SPARK-16168: -- Summary: Spark sql can not read ORC table Key: SPARK-16168 URL: https://issues.apache.org/jira/browse/SPARK-16168 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-16167) RowEncoder should preserve array/map type nullability.

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346072#comment-15346072 ] Apache Spark commented on SPARK-16167: -- User 'ueshin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16167) RowEncoder should preserve array/map type nullability.

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16167: Assignee: (was: Apache Spark) > RowEncoder should preserve array/map type

[jira] [Assigned] (SPARK-16167) RowEncoder should preserve array/map type nullability.

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16167: Assignee: Apache Spark > RowEncoder should preserve array/map type nullability. >

[jira] [Created] (SPARK-16167) RowEncoder should preserve array/map type nullability.

2016-06-23 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-16167: - Summary: RowEncoder should preserve array/map type nullability. Key: SPARK-16167 URL: https://issues.apache.org/jira/browse/SPARK-16167 Project: Spark

[jira] [Updated] (SPARK-16024) add tests for table creation with column comment

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16024: -- Fix Version/s: (was: 2.0.0) 2.0.1 > add tests for table creation with column

[jira] [Updated] (SPARK-15672) R programming guide update

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-15672: -- Fix Version/s: (was: 2.0.0) 2.0.1 > R programming guide update >

[jira] [Commented] (SPARK-15345) SparkSession's conf doesn't take effect when there's already an existing SparkContext

2016-06-23 Thread Piotr Milanowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345999#comment-15345999 ] Piotr Milanowski commented on SPARK-15345: -- Please note, that the issue I raised in my last

[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-06-23 Thread Alex Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345987#comment-15345987 ] Alex Jiang commented on SPARK-13288: Similar issue seen on our development. > [1.6.0] Memory leak in

[jira] [Commented] (SPARK-16146) Spark application failed by Yarn preempting

2016-06-23 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345984#comment-15345984 ] Saisai Shao commented on SPARK-16146: - I see, that could explain why executors get lost so frequently

[jira] [Commented] (SPARK-16146) Spark application failed by Yarn preempting

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345981#comment-15345981 ] Sean Owen commented on SPARK-16146: --- I think this is preemption in the sense that the cloud provider

[jira] [Commented] (SPARK-16164) Filter pushdown should keep the ordering in the logical plan

2016-06-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345977#comment-15345977 ] Dongjoon Hyun commented on SPARK-16164: --- It's my pleasure. :) > Filter pushdown should keep the

[jira] [Assigned] (SPARK-16164) Filter pushdown should keep the ordering in the logical plan

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16164: Assignee: Apache Spark > Filter pushdown should keep the ordering in the logical plan >

[jira] [Assigned] (SPARK-16164) Filter pushdown should keep the ordering in the logical plan

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16164: Assignee: (was: Apache Spark) > Filter pushdown should keep the ordering in the

[jira] [Commented] (SPARK-16164) Filter pushdown should keep the ordering in the logical plan

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345971#comment-15345971 ] Apache Spark commented on SPARK-16164: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-16164) Filter pushdown should keep the ordering in the logical plan

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345958#comment-15345958 ] Xiangrui Meng commented on SPARK-16164: --- Thanks for identifying the root cause so quickly! >

[jira] [Created] (SPARK-16166) Correctly honor off heap memory usage in web ui and log display

2016-06-23 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-16166: --- Summary: Correctly honor off heap memory usage in web ui and log display Key: SPARK-16166 URL: https://issues.apache.org/jira/browse/SPARK-16166 Project: Spark

[jira] [Commented] (SPARK-9478) Add class weights to Random Forest

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345946#comment-15345946 ] Xiangrui Meng commented on SPARK-9478: -- Sorry for being late in the discussion! Instance weight

[jira] [Commented] (SPARK-16164) Filter pushdown should keep the ordering in the logical plan

2016-06-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345945#comment-15345945 ] Dongjoon Hyun commented on SPARK-16164: --- Hi, [~mengxr]. The root cause seems to be

[jira] [Commented] (SPARK-16146) Spark application failed by Yarn preempting

2016-06-23 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345942#comment-15345942 ] Saisai Shao commented on SPARK-16146: - If it is due to preemption, AM log will show the details of

[jira] [Commented] (SPARK-16163) Statistics of logical plan is super slow on large query

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345934#comment-15345934 ] Apache Spark commented on SPARK-16163: -- User 'davies' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16163) Statistics of logical plan is super slow on large query

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16163: Assignee: Apache Spark > Statistics of logical plan is super slow on large query >

[jira] [Assigned] (SPARK-16163) Statistics of logical plan is super slow on large query

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16163: Assignee: (was: Apache Spark) > Statistics of logical plan is super slow on large

[jira] [Updated] (SPARK-15958) Make initial buffer size for the Sorter configurable

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-15958: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > Make initial buffer size for

[jira] [Resolved] (SPARK-13804) Spark SQL's DataFrame.count() Major Divergent (Non-Linear) Performance Slowdown going from 4million rows to 16+ million rows

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13804. --- Resolution: Not A Problem > Spark SQL's DataFrame.count() Major Divergent (Non-Linear) Performance

[jira] [Commented] (SPARK-16065) Throw a exception "java.lang.ClassNotFoundException" when run the spark-submit

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345915#comment-15345915 ] Sean Owen commented on SPARK-16065: --- Either you don't really have the class on the classpath that you

[jira] [Updated] (SPARK-16131) initialize internal logger lazily instead of manual null check

2016-06-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16131: -- Fix Version/s: (was: 2.0.0) 2.0.1 > initialize internal logger lazily instead

[jira] [Updated] (SPARK-9478) Add class weights to Random Forest

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-9478: - Target Version/s: 2.1.0 (was: ) > Add class weights to Random Forest >

[jira] [Assigned] (SPARK-16165) Fix the update logic for InMemoryTableScanExec.readBatches accumulator

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16165: Assignee: Apache Spark > Fix the update logic for InMemoryTableScanExec.readBatches

[jira] [Assigned] (SPARK-16165) Fix the update logic for InMemoryTableScanExec.readBatches accumulator

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16165: Assignee: (was: Apache Spark) > Fix the update logic for

[jira] [Commented] (SPARK-16165) Fix the update logic for InMemoryTableScanExec.readBatches accumulator

2016-06-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345902#comment-15345902 ] Apache Spark commented on SPARK-16165: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-5992) Locality Sensitive Hashing (LSH) for MLlib

2016-06-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345890#comment-15345890 ] Xiangrui Meng commented on SPARK-5992: -- FYI, I had an offline discussion with Kelvin Chu and Erran Li

[jira] [Created] (SPARK-16165) Fix the update logic for InMemoryTableScanExec.readBatches accumulator

2016-06-23 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-16165: - Summary: Fix the update logic for InMemoryTableScanExec.readBatches accumulator Key: SPARK-16165 URL: https://issues.apache.org/jira/browse/SPARK-16165 Project:
