[jira] [Assigned] (SPARK-24386) implement continuous processing coalesce(1)

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24386: Assignee: Apache Spark > implement continuous processing coalesce(1) >

[jira] [Commented] (SPARK-24386) implement continuous processing coalesce(1)

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511994#comment-16511994 ] Apache Spark commented on SPARK-24386: -- User 'jose-torres' has created a pull request for this

[jira] [Assigned] (SPARK-24386) implement continuous processing coalesce(1)

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24386: Assignee: (was: Apache Spark) > implement continuous processing coalesce(1) >

[jira] [Comment Edited] (SPARK-5152) Let metrics.properties file take an hdfs:// path

2018-06-13 Thread John Zhuge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511724#comment-16511724 ] John Zhuge edited comment on SPARK-5152 at 6/14/18 3:21 AM: SPARK-7169 

[jira] [Commented] (SPARK-24528) Missing optimization for Aggregations/Windowing on a bucketed table

2018-06-13 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511924#comment-16511924 ] Liang-Chi Hsieh commented on SPARK-24528: - Btw, I think the complete and reproducible examples

[jira] [Resolved] (SPARK-24546) InsertIntoDataSourceCommand make dataframe with wrong schema

2018-06-13 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24546. -- Resolution: Won't Fix Target Version/s: (was: 2.3.1) >

[jira] [Updated] (SPARK-24540) Support for multiple delimiter in Spark CSV read

2018-06-13 Thread Ashwin K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashwin K updated SPARK-24540: - Description: Currently, the delimiter option in Spark 2.0 to read and split CSV files/data only supports a

[jira] [Commented] (SPARK-22239) User-defined window functions with pandas udf

2018-06-13 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511876#comment-16511876 ] Hyukjin Kwon commented on SPARK-22239: -- Sure, please go ahead. > User-defined window functions

[jira] [Updated] (SPARK-24540) Support for multiple delimiter in Spark CSV read

2018-06-13 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24540: - Component/s: (was: Spark Core) SQL > Support for multiple delimiter in

[jira] [Commented] (SPARK-24467) VectorAssemblerEstimator

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511852#comment-16511852 ] Joseph K. Bradley commented on SPARK-24467: --- True, we would have to make the VectorAssembler

[jira] [Commented] (SPARK-15882) Discuss distributed linear algebra in spark.ml package

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511845#comment-16511845 ] Joseph K. Bradley commented on SPARK-15882: --- I'm afraid I don't have time to prioritize this

[jira] [Updated] (SPARK-24549) 32BitDecimalType and 64BitDecimalType support push down to the data sources

2018-06-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24549: Issue Type: Improvement (was: New Feature) > 32BitDecimalType and 64BitDecimalType support push

[jira] [Updated] (SPARK-24538) ByteArrayDecimalType support push down to the data sources

2018-06-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24538: Issue Type: Improvement (was: New Feature) > ByteArrayDecimalType support push down to the data

[jira] [Commented] (SPARK-24534) Add a way to bypass entrypoint.sh script if no spark cmd is passed

2018-06-13 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511804#comment-16511804 ] Erik Erlandson commented on SPARK-24534: I think this has potential use for customization beyond

[jira] [Commented] (SPARK-23997) Configurable max number of buckets

2018-06-13 Thread Fernando Pereira (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511800#comment-16511800 ] Fernando Pereira commented on SPARK-23997: -- cc [~cloud_fan] [~tejasp] This is a pretty

[jira] [Updated] (SPARK-23732) Broken link to scala source code in Spark Scala api Scaladoc

2018-06-13 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-23732: --- Fix Version/s: 2.2.2 2.1.3 > Broken link to scala source code in Spark

[jira] [Updated] (SPARK-3723) DecisionTree, RandomForest: Add more instrumentation

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3723: - Shepherd: (was: Joseph K. Bradley) > DecisionTree, RandomForest: Add more

[jira] [Updated] (SPARK-3727) Trees and ensembles: More prediction functionality

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3727: - Shepherd: (was: Joseph K. Bradley) > Trees and ensembles: More prediction

[jira] [Updated] (SPARK-5362) Gradient and Optimizer to support generic output (instead of label) and data batches

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5362: - Shepherd: (was: Joseph K. Bradley) > Gradient and Optimizer to support generic output

[jira] [Updated] (SPARK-5556) Latent Dirichlet Allocation (LDA) using Gibbs sampler

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5556: - Shepherd: (was: Joseph K. Bradley) > Latent Dirichlet Allocation (LDA) using Gibbs

[jira] [Updated] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4591: - Shepherd: (was: Joseph K. Bradley) > Algorithm/model parity for spark.ml (Scala) >

[jira] [Updated] (SPARK-8799) OneVsRestModel should extend ClassificationModel

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-8799: - Shepherd: (was: Joseph K. Bradley) > OneVsRestModel should extend ClassificationModel

[jira] [Updated] (SPARK-9120) Add multivariate regression (or prediction) interface

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-9120: - Shepherd: (was: Joseph K. Bradley) > Add multivariate regression (or prediction)

[jira] [Updated] (SPARK-8767) Abstractions for InputColParam, OutputColParam

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-8767: - Shepherd: (was: Joseph K. Bradley) > Abstractions for InputColParam, OutputColParam >

[jira] [Updated] (SPARK-8799) OneVsRestModel should extend ClassificationModel

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-8799: - Target Version/s: (was: 3.0.0) > OneVsRestModel should extend ClassificationModel >

[jira] [Updated] (SPARK-7424) spark.ml classification, regression abstractions should add metadata to output column

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7424: - Shepherd: (was: Joseph K. Bradley) > spark.ml classification, regression abstractions

[jira] [Updated] (SPARK-21166) Automated ML persistence

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-21166: -- Shepherd: (was: Joseph K. Bradley) > Automated ML persistence >

[jira] [Updated] (SPARK-14585) Provide accessor methods for Pipeline stages

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14585: -- Shepherd: (was: Joseph K. Bradley) > Provide accessor methods for Pipeline stages >

[jira] [Updated] (SPARK-19591) Add sample weights to decision trees

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19591: -- Shepherd: (was: Joseph K. Bradley) > Add sample weights to decision trees >

[jira] [Updated] (SPARK-9140) Replace TimeTracker by Stopwatch

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-9140: - Shepherd: (was: Joseph K. Bradley) > Replace TimeTracker by Stopwatch >

[jira] [Updated] (SPARK-15573) Backwards-compatible persistence for spark.ml

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-15573: -- Shepherd: (was: Joseph K. Bradley) > Backwards-compatible persistence for spark.ml

[jira] [Updated] (SPARK-19498) Discussion: Making MLlib APIs extensible for 3rd party libraries

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19498: -- Shepherd: (was: Joseph K. Bradley) > Discussion: Making MLlib APIs extensible for

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-24359: -- Shepherd: Xiangrui Meng (was: Joseph K. Bradley) > SPIP: ML Pipelines in R >

[jira] [Updated] (SPARK-24097) Instruments improvements - RandomForest and GradientBoostedTree

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-24097: -- Shepherd: (was: Joseph K. Bradley) > Instruments improvements - RandomForest and

[jira] [Updated] (SPARK-21926) Compatibility between ML Transformers and Structured Streaming

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-21926: -- Shepherd: (was: Joseph K. Bradley) > Compatibility between ML Transformers and

[jira] [Updated] (SPARK-24212) PrefixSpan in spark.ml: user guide section

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-24212: -- Shepherd: (was: Joseph K. Bradley) > PrefixSpan in spark.ml: user guide section >

[jira] [Resolved] (SPARK-14376) spark.ml parity for trees

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-14376. --- Resolution: Fixed Fix Version/s: 2.4.0 > spark.ml parity for trees >

[jira] [Assigned] (SPARK-10817) ML abstraction umbrella

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-10817: - Assignee: (was: Joseph K. Bradley) > ML abstraction umbrella >

[jira] [Assigned] (SPARK-5572) LDA improvement listing

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-5572: Assignee: (was: Joseph K. Bradley) > LDA improvement listing >

[jira] [Assigned] (SPARK-4285) Transpose RDD[Vector] to column store for ML

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-4285: Assignee: (was: Joseph K. Bradley) > Transpose RDD[Vector] to column store

[jira] [Updated] (SPARK-5572) LDA improvement listing

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5572: - Shepherd: (was: Joseph K. Bradley) > LDA improvement listing > ---

[jira] [Assigned] (SPARK-7206) Gaussian Mixture Model (GMM) improvements

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-7206: Assignee: (was: Joseph K. Bradley) > Gaussian Mixture Model (GMM)

[jira] [Commented] (SPARK-14376) spark.ml parity for trees

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511751#comment-16511751 ] Joseph K. Bradley commented on SPARK-14376: --- Thanks! I'll close it. > spark.ml parity for

[jira] [Assigned] (SPARK-14604) Modify design of ML model summaries

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-14604: - Assignee: (was: Joseph K. Bradley) > Modify design of ML model summaries >

[jira] [Commented] (SPARK-19609) Broadcast joins should pushdown join constraints as Filter to the larger relation

2018-06-13 Thread David McLennan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511746#comment-16511746 ] David McLennan commented on SPARK-19609: This feature would be extremely useful in making

[jira] [Commented] (SPARK-22666) Spark datasource for image format

2018-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511745#comment-16511745 ] Joseph K. Bradley commented on SPARK-22666: --- Side note: The Java library we use for reading

[jira] [Comment Edited] (SPARK-24530) pyspark.ml doesn't generate class docs correctly

2018-06-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511730#comment-16511730 ] Dongjoon Hyun edited comment on SPARK-24530 at 6/13/18 10:28 PM: - Hi,

[jira] [Updated] (SPARK-24530) pyspark.ml doesn't generate class docs correctly

2018-06-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24530: -- Attachment: image-2018-06-13-15-15-51-025.png > pyspark.ml doesn't generate class docs

[jira] [Commented] (SPARK-5152) Let metrics.properties file take an hdfs:// path

2018-06-13 Thread John Zhuge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511724#comment-16511724 ] John Zhuge commented on SPARK-5152: --- SPARK-7169 alleviated this issue, however, still find this

[jira] [Updated] (SPARK-24530) pyspark.ml doesn't generate class docs correctly

2018-06-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24530: -- Description: I generated python docs from master locally using `make html`. However, the

[jira] [Commented] (SPARK-24530) pyspark.ml doesn't generate class docs correctly

2018-06-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511730#comment-16511730 ] Dongjoon Hyun commented on SPARK-24530: --- Hi, [~mengxr] . I got the following locally. It

[jira] [Resolved] (SPARK-24531) HiveExternalCatalogVersionsSuite failing due to missing 2.2.0 version

2018-06-13 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24531. - Resolution: Fixed Assignee: Marco Gaido Fix Version/s: 2.4.0 >

[jira] [Comment Edited] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-13 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511695#comment-16511695 ] Jiang Xingbo edited comment on SPARK-24552 at 6/13/18 9:47 PM: --- IIUC

[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-13 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511695#comment-16511695 ] Jiang Xingbo commented on SPARK-24552: -- IIUC stageAttemptId + taskAttemptId shall probably define a

[jira] [Updated] (SPARK-23874) Upgrade apache/arrow to 0.10.0

2018-06-13 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-23874: - Description: Version 0.10.0 will allow for the following improvements and bug fixes: * Allow

[jira] [Commented] (SPARK-24554) Add MapType Support for Arrow in PySpark

2018-06-13 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511691#comment-16511691 ] Bryan Cutler commented on SPARK-24554: -- There still is work to be done to add a Map logical type to

[jira] [Created] (SPARK-24554) Add MapType Support for Arrow in PySpark

2018-06-13 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-24554: Summary: Add MapType Support for Arrow in PySpark Key: SPARK-24554 URL: https://issues.apache.org/jira/browse/SPARK-24554 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-13 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511674#comment-16511674 ] Imran Rashid commented on SPARK-24552: -- I wouldn't call this a bug in the scheduler, though I agree

[jira] [Commented] (SPARK-24525) Provide an option to limit MemorySink memory usage

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511655#comment-16511655 ] Apache Spark commented on SPARK-24525: -- User 'mukulmurthy' has created a pull request for this

[jira] [Assigned] (SPARK-24525) Provide an option to limit MemorySink memory usage

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24525: Assignee: (was: Apache Spark) > Provide an option to limit MemorySink memory usage >

[jira] [Assigned] (SPARK-24525) Provide an option to limit MemorySink memory usage

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24525: Assignee: Apache Spark > Provide an option to limit MemorySink memory usage >

[jira] [Created] (SPARK-24553) Job UI redirect causing http 302 error

2018-06-13 Thread Steven Kallman (JIRA)
Steven Kallman created SPARK-24553: -- Summary: Job UI redirect causing http 302 error Key: SPARK-24553 URL: https://issues.apache.org/jira/browse/SPARK-24553 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-24415) Stage page aggregated executor metrics wrong when failures

2018-06-13 Thread Ankur Gupta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511618#comment-16511618 ] Ankur Gupta commented on SPARK-24415: - I am planning to work on this JIRA > Stage page aggregated

[jira] [Assigned] (SPARK-24235) create the top-of-task RDD sending rows to the remote buffer

2018-06-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-24235: Assignee: Jose Torres > create the top-of-task RDD sending rows to the remote buffer >

[jira] [Resolved] (SPARK-24235) create the top-of-task RDD sending rows to the remote buffer

2018-06-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-24235. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21428

[jira] [Assigned] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24552: Assignee: Apache Spark > Task attempt numbers are reused when stages are retried >

[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511593#comment-16511593 ] Apache Spark commented on SPARK-24552: -- User 'rdblue' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24552: Assignee: (was: Apache Spark) > Task attempt numbers are reused when stages are

[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-13 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511590#comment-16511590 ] Ryan Blue commented on SPARK-24552: --- cc [~vanzin], [~henryr], [~cloud_fan] > Task attempt numbers are

[jira] [Updated] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-13 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-24552: -- Summary: Task attempt numbers are reused when stages are retried (was: Task attempt ids are reused

[jira] [Updated] (SPARK-24552) Task attempt ids are reused when stages are retried

2018-06-13 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-24552: -- Description: When stages are retried due to shuffle failures, task attempt numbers are reused. This

[jira] [Created] (SPARK-24552) Task attempt ids are reused when stages are retried

2018-06-13 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-24552: - Summary: Task attempt ids are reused when stages are retried Key: SPARK-24552 URL: https://issues.apache.org/jira/browse/SPARK-24552 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24552) Task attempt ids are reused when stages are retried

2018-06-13 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-24552: -- Description: When stages are retried due to shuffle failures, task attempt ids are reused. This

[jira] [Commented] (SPARK-24528) Missing optimization for Aggregations/Windowing on a bucketed table

2018-06-13 Thread Ohad Raviv (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511578#comment-16511578 ] Ohad Raviv commented on SPARK-24528: I think the 2nd point better suits my use case. I'll try to look

[jira] [Commented] (SPARK-24528) Missing optimization for Aggregations/Windowing on a bucketed table

2018-06-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511533#comment-16511533 ] Wenchen Fan commented on SPARK-24528: - I have 2 ideas: 1. provide an option to let Spark only

[jira] [Commented] (SPARK-22239) User-defined window functions with pandas udf

2018-06-13 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511495#comment-16511495 ] Li Jin commented on SPARK-22239: [~hyukjin.kwon] I actually don't think this Jira is done. The PR only

[jira] [Commented] (SPARK-24528) Missing optimization for Aggregations/Windowing on a bucketed table

2018-06-13 Thread Ohad Raviv (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511483#comment-16511483 ] Ohad Raviv commented on SPARK-24528: I understand the tradeoff, the question is how could we

[jira] [Assigned] (SPARK-24439) Add distanceMeasure to BisectingKMeans in PySpark

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24439: Assignee: Apache Spark > Add distanceMeasure to BisectingKMeans in PySpark >

[jira] [Assigned] (SPARK-24439) Add distanceMeasure to BisectingKMeans in PySpark

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24439: Assignee: (was: Apache Spark) > Add distanceMeasure to BisectingKMeans in PySpark >

[jira] [Commented] (SPARK-24439) Add distanceMeasure to BisectingKMeans in PySpark

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511482#comment-16511482 ] Apache Spark commented on SPARK-24439: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Updated] (SPARK-24439) Add distanceMeasure to BisectingKMeans in PySpark

2018-06-13 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-24439: --- Priority: Minor (was: Major) > Add distanceMeasure to BisectingKMeans in PySpark >

[jira] [Commented] (SPARK-24528) Missing optimization for Aggregations/Windowing on a bucketed table

2018-06-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511452#comment-16511452 ] Wenchen Fan commented on SPARK-24528: - It's a different problem. Spark makes a tradeoff for

[jira] [Commented] (SPARK-24539) HistoryServer does not display metrics from tasks that complete after stage failure

2018-06-13 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511365#comment-16511365 ] Marcelo Vanzin commented on SPARK-24539: From a previous chat with [~tgraves] this sounds like

[jira] [Issue Comment Deleted] (SPARK-24548) JavaPairRDD to Dataset in SPARK generates ambiguous results

2018-06-13 Thread Tomasz Gawęda (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomasz Gawęda updated SPARK-24548: -- Comment: was deleted (was: IMHO names should be distinct, in other cases it's hard to query

[jira] [Resolved] (SPARK-24500) UnsupportedOperationException when trying to execute Union plan with Stream of children

2018-06-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24500. - Resolution: Fixed Fix Version/s: 2.4.0 > UnsupportedOperationException when trying to

[jira] [Created] (SPARK-24551) Add Integration tests for Secrets

2018-06-13 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-24551: --- Summary: Add Integration tests for Secrets Key: SPARK-24551 URL: https://issues.apache.org/jira/browse/SPARK-24551 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24550) Registration of Kubernetes specific metrics

2018-06-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24550: Summary: Registration of Kubernetes specific metrics (was: Registration of K8s

[jira] [Updated] (SPARK-24550) Add support for Kubernetes specific metrics

2018-06-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24550: Summary: Add support for Kubernetes specific metrics (was: Registration of

[jira] [Updated] (SPARK-24550) Registration of K8s specific metrics

2018-06-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24550: Description: Spark by default offers a specific set of metrics for monitoring. It

[jira] [Updated] (SPARK-24550) Registration of K8s specific metrics

2018-06-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24550: Issue Type: New Feature (was: Bug) > Registration of K8s specific metrics >

[jira] [Updated] (SPARK-24550) Registration of K8s specific metrics

2018-06-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24550: Description: Spark by default offers a specific set of metrics for monitoring. It

[jira] [Updated] (SPARK-24550) Registration of K8s specific metrics

2018-06-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24550: Description: Spark by default offers a specific set of metrics for monitoring. It

[jira] [Updated] (SPARK-24550) Registration of K8s specific metrics

2018-06-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24550: Description: Spark by default offers a specific set of metrics for monitoring. It

[jira] [Created] (SPARK-24550) Registration of K8s specific metrics

2018-06-13 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-24550: --- Summary: Registration of K8s specific metrics Key: SPARK-24550 URL: https://issues.apache.org/jira/browse/SPARK-24550 Project: Spark Issue

[jira] [Assigned] (SPARK-24479) Register StreamingQueryListener in Spark Conf

2018-06-13 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-24479: Assignee: Arun Mahadevan > Register StreamingQueryListener in Spark Conf >

[jira] [Resolved] (SPARK-24479) Register StreamingQueryListener in Spark Conf

2018-06-13 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24479. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21504

[jira] [Commented] (SPARK-24548) JavaPairRDD to Dataset in SPARK generates ambiguous results

2018-06-13 Thread Tomasz Gawęda (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511056#comment-16511056 ] Tomasz Gawęda commented on SPARK-24548: --- IMHO names should be distinct, in other cases it's hard

[jira] [Issue Comment Deleted] (SPARK-24549) 32BitDecimalType and 64BitDecimalType support push down to the data sources

2018-06-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24549: Comment: was deleted (was: I'm working on this) > 32BitDecimalType and 64BitDecimalType support

[jira] [Assigned] (SPARK-24549) 32BitDecimalType and 64BitDecimalType support push down to the data sources

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24549: Assignee: (was: Apache Spark) > 32BitDecimalType and 64BitDecimalType support push

[jira] [Commented] (SPARK-24549) 32BitDecimalType and 64BitDecimalType support push down to the data sources

2018-06-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511049#comment-16511049 ] Apache Spark commented on SPARK-24549: -- User 'wangyum' has created a pull request for this issue: