[jira] [Commented] (SPARK-18258) Sinks need access to offset representation

2016-11-04 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638555#comment-15638555 ] Cody Koeninger commented on SPARK-18258: Sure, added, let me know if I'm missing something or can

[jira] [Updated] (SPARK-18258) Sinks need access to offset representation

2016-11-04 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger updated SPARK-18258: --- Description: Transactional "exactly-once" semantics for output require storing an offset

[jira] [Commented] (SPARK-12757) Use reference counting to prevent blocks from being evicted during reads

2016-11-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638546#comment-15638546 ] Felix Cheung commented on SPARK-12757: -- I'm seeing the same with latest master running a pipeline

[jira] [Created] (SPARK-18285) approxQuantile in R support multi-column

2016-11-04 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-18285: Summary: approxQuantile in R support multi-column Key: SPARK-18285 URL: https://issues.apache.org/jira/browse/SPARK-18285 Project: Spark Issue Type:

[jira] [Commented] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2016-11-04 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638489#comment-15638489 ] zhengruifeng commented on SPARK-13677: -- It is shown in sklearn's doc here

[jira] [Commented] (SPARK-14047) GBT improvement umbrella

2016-11-04 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638475#comment-15638475 ] zhengruifeng commented on SPARK-14047: -- I personally think SPARK-15581 may be a improvement about

[jira] [Comment Edited] (SPARK-14047) GBT improvement umbrella

2016-11-04 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638475#comment-15638475 ] zhengruifeng edited comment on SPARK-14047 at 11/5/16 3:21 AM: --- I

[jira] [Commented] (SPARK-18258) Sinks need access to offset representation

2016-11-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638470#comment-15638470 ] Reynold Xin commented on SPARK-18258: - This makes sense. It's just extra information you want to be

[jira] [Commented] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2016-11-04 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638459#comment-15638459 ] zhengruifeng commented on SPARK-13677: -- Since mllib is in maintenance status. If this feature will

[jira] [Updated] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2016-11-04 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-13677: - Summary: Support Tree-Based Feature Transformation for ML (was: Support Tree-Based Feature

[jira] [Updated] (SPARK-14174) Accelerate KMeans via Mini-Batch EM

2016-11-04 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-14174: - Description: The MiniBatchKMeans is a variant of the KMeans algorithm which uses mini-batches

[jira] [Created] (SPARK-18284) Scheme of DataFrame generated from RDD is different between master and 2.0

2016-11-04 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-18284: Summary: Scheme of DataFrame generated from RDD is different between master and 2.0 Key: SPARK-18284 URL: https://issues.apache.org/jira/browse/SPARK-18284

[jira] [Resolved] (SPARK-18256) Improve performance of event log replay in HistoryServer based on profiler results

2016-11-04 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-18256. -- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 15756

[jira] [Commented] (SPARK-17748) One-pass algorithm for linear regression with L1 and elastic-net penalties

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638392#comment-15638392 ] Apache Spark commented on SPARK-17748: -- User 'jkbradley' has created a pull request for this issue:

[jira] [Commented] (SPARK-15581) MLlib 2.1 Roadmap

2016-11-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638280#comment-15638280 ] Felix Cheung commented on SPARK-15581: -- This is a great next step if we could get more concrete on

[jira] [Commented] (SPARK-10523) SparkR formula syntax to turn strings/factors into numerics

2016-11-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638254#comment-15638254 ] Felix Cheung commented on SPARK-10523: -- Is this still an issue? As Yanbo says, we now support string

[jira] [Assigned] (SPARK-18283) Add a test to make sure the default starting offset is latest

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18283: Assignee: Tathagata Das (was: Apache Spark) > Add a test to make sure the default

[jira] [Commented] (SPARK-18283) Add a test to make sure the default starting offset is latest

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638228#comment-15638228 ] Apache Spark commented on SPARK-18283: -- User 'tdas' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18283) Add a test to make sure the default starting offset is latest

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18283: Assignee: Apache Spark (was: Tathagata Das) > Add a test to make sure the default

[jira] [Created] (SPARK-18283) Add a test to make sure the default starting offset is latest

2016-11-04 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-18283: - Summary: Add a test to make sure the default starting offset is latest Key: SPARK-18283 URL: https://issues.apache.org/jira/browse/SPARK-18283 Project: Spark

[jira] [Commented] (SPARK-18282) Add model summaries for Python GMM and BisectingKMeans

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638190#comment-15638190 ] Apache Spark commented on SPARK-18282: -- User 'sethah' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18282) Add model summaries for Python GMM and BisectingKMeans

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18282: Assignee: Apache Spark > Add model summaries for Python GMM and BisectingKMeans >

[jira] [Assigned] (SPARK-18282) Add model summaries for Python GMM and BisectingKMeans

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18282: Assignee: (was: Apache Spark) > Add model summaries for Python GMM and

[jira] [Created] (SPARK-18282) Add model summaries for Python GMM and BisectingKMeans

2016-11-04 Thread Seth Hendrickson (JIRA)
Seth Hendrickson created SPARK-18282: Summary: Add model summaries for Python GMM and BisectingKMeans Key: SPARK-18282 URL: https://issues.apache.org/jira/browse/SPARK-18282 Project: Spark

[jira] [Commented] (SPARK-17710) ReplSuite fails with ClassCircularityError in master Maven builds

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15638114#comment-15638114 ] Apache Spark commented on SPARK-17710: -- User 'weiqingy' has created a pull request for this issue:

[jira] [Updated] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2016-11-04 Thread Luke Miner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Miner updated SPARK-18281: --- Description: I run the example straight out of the api docs for toLocalIterator and it gives a time

[jira] [Created] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2016-11-04 Thread Luke Miner (JIRA)
Luke Miner created SPARK-18281: -- Summary: toLocalIterator yields time out error on pyspark2 Key: SPARK-18281 URL: https://issues.apache.org/jira/browse/SPARK-18281 Project: Spark Issue Type:

[jira] [Updated] (SPARK-16804) Correlated subqueries containing non-deterministic operators return incorrect results

2016-11-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16804: Fix Version/s: 2.0.2 > Correlated subqueries containing non-deterministic operators return

[jira] [Updated] (SPARK-17337) Incomplete algorithm for name resolution in Catalyst parser may lead to incorrect result

2016-11-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17337: Fix Version/s: 2.0.2 > Incomplete algorithm for name resolution in Catalyst parser may lead to >

[jira] [Commented] (SPARK-16804) Correlated subqueries containing non-deterministic operators return incorrect results

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637905#comment-15637905 ] Apache Spark commented on SPARK-16804: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18280) Potential deadlock in `StandaloneSchedulerBackend.dead`

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18280: Assignee: (was: Apache Spark) > Potential deadlock in

[jira] [Commented] (SPARK-18280) Potential deadlock in `StandaloneSchedulerBackend.dead`

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637873#comment-15637873 ] Apache Spark commented on SPARK-18280: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18280) Potential deadlock in `StandaloneSchedulerBackend.dead`

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18280: Assignee: Apache Spark > Potential deadlock in `StandaloneSchedulerBackend.dead` >

[jira] [Created] (SPARK-18280) Potential deadlock in `StandaloneSchedulerBackend.dead`

2016-11-04 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-18280: Summary: Potential deadlock in `StandaloneSchedulerBackend.dead` Key: SPARK-18280 URL: https://issues.apache.org/jira/browse/SPARK-18280 Project: Spark

[jira] [Updated] (SPARK-18280) Potential deadlock in `StandaloneSchedulerBackend.dead`

2016-11-04 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18280: - Affects Version/s: 1.6.2 2.0.0 2.0.1 > Potential

[jira] [Commented] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637824#comment-15637824 ] Apache Spark commented on SPARK-18189: -- User 'seyfe' has created a pull request for this issue:

[jira] [Updated] (SPARK-18279) ML programming guide should have R examples

2016-11-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-18279: - Affects Version/s: 2.1.0 Target Version/s: 2.1.0 > ML programming guide should have R

[jira] [Updated] (SPARK-18279) ML programming guide should have R examples

2016-11-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-18279: - Description: http://spark.apache.org/docs/latest/ml-classification-regression.html for example,

[jira] [Created] (SPARK-18279) ML programming guide should have R examples

2016-11-04 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-18279: Summary: ML programming guide should have R examples Key: SPARK-18279 URL: https://issues.apache.org/jira/browse/SPARK-18279 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18266) Update R vignettes and programming guide for 2.1.0 release

2016-11-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637756#comment-15637756 ] Felix Cheung commented on SPARK-18266: -- Actually, I just realize the ML programming guide (not just

[jira] [Closed] (SPARK-18273) DataFrameReader.load takes a lot of time to start the job if a lot of file/dir paths are pass

2016-11-04 Thread Aniket Bhatnagar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aniket Bhatnagar closed SPARK-18273. > DataFrameReader.load takes a lot of time to start the job if a lot of > file/dir paths are

[jira] [Resolved] (SPARK-18273) DataFrameReader.load takes a lot of time to start the job if a lot of file/dir paths are pass

2016-11-04 Thread Aniket Bhatnagar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aniket Bhatnagar resolved SPARK-18273. -- Resolution: Not A Problem Glob patterns can be passed instead of full paths to reduce

[jira] [Commented] (SPARK-18273) DataFrameReader.load takes a lot of time to start the job if a lot of file/dir paths are pass

2016-11-04 Thread Aniket Bhatnagar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637731#comment-15637731 ] Aniket Bhatnagar commented on SPARK-18273: -- Thanks [~srowen]. Didn't realize that I could

[jira] [Commented] (SPARK-18277) na.fill() and friends should work on struct fields

2016-11-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637713#comment-15637713 ] Nicholas Chammas commented on SPARK-18277: -- {quote} If you try {{when()}}, you realize that you

[jira] [Assigned] (SPARK-18276) Some ML training summaries are not copied when {{copy()}} is called.

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18276: Assignee: Apache Spark > Some ML training summaries are not copied when {{copy()}} is

[jira] [Commented] (SPARK-18276) Some ML training summaries are not copied when {{copy()}} is called.

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637671#comment-15637671 ] Apache Spark commented on SPARK-18276: -- User 'sethah' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18276) Some ML training summaries are not copied when {{copy()}} is called.

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18276: Assignee: (was: Apache Spark) > Some ML training summaries are not copied when

[jira] [Commented] (SPARK-18277) na.fill() and friends should work on struct fields

2016-11-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637654#comment-15637654 ] Nicholas Chammas commented on SPARK-18277: -- Thanks for the pointer. I'll follow the discussion

[jira] [Updated] (SPARK-18278) Support native submission of spark jobs to a kubernetes cluster

2016-11-04 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Erlandson updated SPARK-18278: --- External issue URL: https://github.com/kubernetes/kubernetes/issues/34377 External issue

[jira] [Commented] (SPARK-18258) Sinks need access to offset representation

2016-11-04 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637621#comment-15637621 ] Cody Koeninger commented on SPARK-18258: So one obvious one is that if wherever checkpoint data

[jira] [Commented] (SPARK-18277) na.fill() and friends should work on struct fields

2016-11-04 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637622#comment-15637622 ] Michael Armbrust commented on SPARK-18277: -- We've been talking about better support for nested

[jira] [Updated] (SPARK-14387) Enable Hive-1.x ORC compatibility with spark.sql.hive.convertMetastoreOrc

2016-11-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-14387: -- Target Version/s: (was: 2.0.2) > Enable Hive-1.x ORC compatibility with

[jira] [Comment Edited] (SPARK-18266) Update R vignettes and programming guide for 2.1.0 release

2016-11-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637604#comment-15637604 ] Felix Cheung edited comment on SPARK-18266 at 11/4/16 8:38 PM: --- I'm not

[jira] [Updated] (SPARK-14241) Output of monotonically_increasing_id lacks stable relation with rows of DataFrame

2016-11-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-14241: -- Assignee: Cheng Lian > Output of monotonically_increasing_id lacks stable relation with rows of >

[jira] [Commented] (SPARK-18266) Update R vignettes and programming guide for 2.1.0 release

2016-11-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637604#comment-15637604 ] Felix Cheung commented on SPARK-18266: -- I'm not sure it is, actually. If I recall there shouldn't be

[jira] [Assigned] (SPARK-17337) Incomplete algorithm for name resolution in Catalyst parser may lead to incorrect result

2016-11-04 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell reassigned SPARK-17337: - Assignee: Herman van Hovell > Incomplete algorithm for name resolution in

[jira] [Commented] (SPARK-17337) Incomplete algorithm for name resolution in Catalyst parser may lead to incorrect result

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637600#comment-15637600 ] Apache Spark commented on SPARK-17337: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Commented] (SPARK-18278) Support native submission of spark jobs to a kubernetes cluster

2016-11-04 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637602#comment-15637602 ] Anirudh Ramanathan commented on SPARK-18278: Corresponding issue in kubernetes:

[jira] [Resolved] (SPARK-17337) Incomplete algorithm for name resolution in Catalyst parser may lead to incorrect result

2016-11-04 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17337. --- Resolution: Fixed Fix Version/s: 2.1.0 > Incomplete algorithm for name

[jira] [Commented] (SPARK-18258) Sinks need access to offset representation

2016-11-04 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637595#comment-15637595 ] Michael Armbrust commented on SPARK-18258: -- What sort of failures are you anticipating here? >

[jira] [Commented] (SPARK-18278) Support native submission of spark jobs to a kubernetes cluster

2016-11-04 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637591#comment-15637591 ] Erik Erlandson commented on SPARK-18278: Current prototype:

[jira] [Commented] (SPARK-18258) Sinks need access to offset representation

2016-11-04 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637576#comment-15637576 ] Cody Koeninger commented on SPARK-18258: The sink doesn't have to reason about equality of the

[jira] [Comment Edited] (SPARK-18277) na.fill() and friends should work on struct fields

2016-11-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637566#comment-15637566 ] Nicholas Chammas edited comment on SPARK-18277 at 11/4/16 8:25 PM: ---

[jira] [Created] (SPARK-18278) Support native submission of spark jobs to a kubernetes cluster

2016-11-04 Thread Erik Erlandson (JIRA)
Erik Erlandson created SPARK-18278: -- Summary: Support native submission of spark jobs to a kubernetes cluster Key: SPARK-18278 URL: https://issues.apache.org/jira/browse/SPARK-18278 Project: Spark

[jira] [Updated] (SPARK-18277) na.fill() and friends should work on struct fields

2016-11-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-18277: - Description: It appears that you cannot use {{fill()}} and friends to quickly modify

[jira] [Commented] (SPARK-18277) na.fill() and friends should work on struct fields

2016-11-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637566#comment-15637566 ] Nicholas Chammas commented on SPARK-18277: -- [~marmbrus] / [~yhuai]: Is there is workaround for

[jira] [Updated] (SPARK-18277) na.fill() and friends should work on struct fields

2016-11-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-18277: - Description: It appears that you cannot use {{fill()}} and friends to quickly modify

[jira] [Commented] (SPARK-18081) Locality Sensitive Hashing (LSH) User Guide

2016-11-04 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637493#comment-15637493 ] Yun Ni commented on SPARK-18081: This is super helpful. Thanks! > Locality Sensitive Hashing (LSH) User

[jira] [Created] (SPARK-18277) na.fill() and friends should work on struct fields

2016-11-04 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-18277: Summary: na.fill() and friends should work on struct fields Key: SPARK-18277 URL: https://issues.apache.org/jira/browse/SPARK-18277 Project: Spark

[jira] [Commented] (SPARK-18258) Sinks need access to offset representation

2016-11-04 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637491#comment-15637491 ] Michael Armbrust commented on SPARK-18258: -- I agree that we don't want to lock people in, which

[jira] [Created] (SPARK-18276) Some ML training summaries are not copied when {{copy()}} is called.

2016-11-04 Thread Seth Hendrickson (JIRA)
Seth Hendrickson created SPARK-18276: Summary: Some ML training summaries are not copied when {{copy()}} is called. Key: SPARK-18276 URL: https://issues.apache.org/jira/browse/SPARK-18276

[jira] [Updated] (SPARK-18258) Sinks need access to offset representation

2016-11-04 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18258: - Target Version/s: 2.2.0 > Sinks need access to offset representation >

[jira] [Commented] (SPARK-18081) Locality Sensitive Hashing (LSH) User Guide

2016-11-04 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637424#comment-15637424 ] Seth Hendrickson commented on SPARK-18081: -- No worries, just wanted to check in to see if you

[jira] [Resolved] (SPARK-18197) Optimise AppendOnlyMap implementation

2016-11-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18197. - Resolution: Fixed Assignee: Adam Roberts Fix Version/s: 2.1.0 > Optimise

[jira] [Assigned] (SPARK-18260) from_json can throw a better exception when it can't find the column or be nullSafe

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18260: Assignee: Apache Spark > from_json can throw a better exception when it can't find the

[jira] [Commented] (SPARK-18260) from_json can throw a better exception when it can't find the column or be nullSafe

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637290#comment-15637290 ] Apache Spark commented on SPARK-18260: -- User 'brkyvz' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18260) from_json can throw a better exception when it can't find the column or be nullSafe

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18260: Assignee: (was: Apache Spark) > from_json can throw a better exception when it can't

[jira] [Updated] (SPARK-18273) DataFrameReader.load takes a lot of time to start the job if a lot of file/dir paths are pass

2016-11-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18273: -- Priority: Minor (was: Major) I'm not sure it's worth the complexity. How about passing a glob

[jira] [Commented] (SPARK-18266) Update R vignettes and programming guide for 2.1.0 release

2016-11-04 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637204#comment-15637204 ] Miao Wang commented on SPARK-18266: --- [~felixcheung] Is this an umbrella JIRA? > Update R vignettes and

[jira] [Resolved] (SPARK-18193) queueStream not updated if rddQueue.add after create queueStream in Java

2016-11-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18193. --- Resolution: Not A Problem I looked into this further and found that it does work to add RDDs after

[jira] [Resolved] (SPARK-18065) Spark 2 allows filter/where on columns not in current schema

2016-11-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18065. --- Resolution: Not A Problem > Spark 2 allows filter/where on columns not in current schema >

[jira] [Resolved] (SPARK-17969) I think it's user unfriendly to process standard json file with DataFrame

2016-11-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17969. --- Resolution: Won't Fix > I think it's user unfriendly to process standard json file with DataFrame >

[jira] [Resolved] (SPARK-17945) Writing to S3 should allow setting object metadata

2016-11-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17945. --- Resolution: Won't Fix > Writing to S3 should allow setting object metadata >

[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2016-11-04 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637189#comment-15637189 ] Miao Wang commented on SPARK-15784: --- I created a new PR to implement PIC as a Transformer. > Add Power

[jira] [Commented] (SPARK-17337) Incomplete algorithm for name resolution in Catalyst parser may lead to incorrect result

2016-11-04 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637186#comment-15637186 ] Nattavut Sutyanyong commented on SPARK-17337: - Totally agreed on your approach. We should

[jira] [Commented] (SPARK-17337) Incomplete algorithm for name resolution in Catalyst parser may lead to incorrect result

2016-11-04 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637167#comment-15637167 ] Herman van Hovell commented on SPARK-17337: --- [~nsyca] You are right to say that this is part of

[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2016-11-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637132#comment-15637132 ] Apache Spark commented on SPARK-15784: -- User 'wangmiao1981' has created a pull request for this

[jira] [Updated] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-18128: Issue Type: Sub-task (was: Improvement) Parent: SPARK-18267 > Add support for publishing to PyPI

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637110#comment-15637110 ] holdenk commented on SPARK-18128: - When I e-mailed [~prabinb] earlier this week I got an out of office

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637111#comment-15637111 ] holdenk commented on SPARK-18128: - Sure > Add support for publishing to PyPI >

[jira] [Commented] (SPARK-18081) Locality Sensitive Hashing (LSH) User Guide

2016-11-04 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637102#comment-15637102 ] Yun Ni commented on SPARK-18081: Sorry, I was really overloaded this week. I will try my best to send a

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637104#comment-15637104 ] holdenk commented on SPARK-18128: - Good call - so publishing to PyPI test has worked fine but there might

[jira] [Updated] (SPARK-17337) Incomplete algorithm for name resolution in Catalyst paser may lead to incorrect result

2016-11-04 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-17337: -- Labels: correctness (was: ) > Incomplete algorithm for name resolution in Catalyst

[jira] [Commented] (SPARK-17348) Incorrect results from subquery transformation

2016-11-04 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637099#comment-15637099 ] Herman van Hovell commented on SPARK-17348: --- Yeah, it would be nice if you can consolidate the

[jira] [Resolved] (SPARK-18275) Why does not use an ordered queue in takeOrdered?

2016-11-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18275. --- Resolution: Not A Problem Questions should go to user@ Priority queues are not necessarily sorted

[jira] [Commented] (SPARK-18128) Add support for publishing to PyPI

2016-11-04 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636944#comment-15636944 ] Nicholas Chammas commented on SPARK-18128: -- For the record: Let's also check with the PyPI

[jira] [Commented] (SPARK-17348) Incorrect results from subquery transformation

2016-11-04 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636886#comment-15636886 ] Nattavut Sutyanyong commented on SPARK-17348: - I'd like to a note that a piece of existing

[jira] [Comment Edited] (SPARK-4563) Allow spark driver to bind to different ip then advertise ip

2016-11-04 Thread György Süveges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636802#comment-15636802 ] György Süveges edited comment on SPARK-4563 at 11/4/16 4:13 PM: +1

[jira] [Comment Edited] (SPARK-4563) Allow spark driver to bind to different ip then advertise ip

2016-11-04 Thread György Süveges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636802#comment-15636802 ] György Süveges edited comment on SPARK-4563 at 11/4/16 4:01 PM: +1

[jira] [Comment Edited] (SPARK-4563) Allow spark driver to bind to different ip then advertise ip

2016-11-04 Thread György Süveges (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15636802#comment-15636802 ] György Süveges edited comment on SPARK-4563 at 11/4/16 4:00 PM: +1
