[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-14 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513384#comment-16513384 ] Shivaram Venkataraman commented on SPARK-24359: --- Yes - that's what I meant [~felixcheung]

[jira] [Commented] (SPARK-24566) spark.storage.blockManagerSlaveTimeoutMs default config

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513379#comment-16513379 ] Apache Spark commented on SPARK-24566: -- User 'xueyumusic' has created a pull request for this

[jira] [Assigned] (SPARK-24566) spark.storage.blockManagerSlaveTimeoutMs default config

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24566: Assignee: (was: Apache Spark) > spark.storage.blockManagerSlaveTimeoutMs default

[jira] [Assigned] (SPARK-24566) spark.storage.blockManagerSlaveTimeoutMs default config

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24566: Assignee: Apache Spark > spark.storage.blockManagerSlaveTimeoutMs default config >

[jira] [Updated] (SPARK-24566) spark.storage.blockManagerSlaveTimeoutMs default config

2018-06-14 Thread xueyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xueyu updated SPARK-24566: -- External issue URL: https://github.com/apache/spark/pull/21575 > spark.storage.blockManagerSlaveTimeoutMs

[jira] [Commented] (SPARK-24535) Fix java version parsing in SparkR

2018-06-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513372#comment-16513372 ] Felix Cheung commented on SPARK-24535: -- is this only failing on windows? I wonder if this is

[jira] [Created] (SPARK-24566) spark.storage.blockManagerSlaveTimeoutMs default config

2018-06-14 Thread xueyu (JIRA)
xueyu created SPARK-24566: - Summary: spark.storage.blockManagerSlaveTimeoutMs default config Key: SPARK-24566 URL: https://issues.apache.org/jira/browse/SPARK-24566 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513363#comment-16513363 ] Felix Cheung commented on SPARK-24359: -- [~shivaram] sure - do you mean 2.3.1.1 though? 2.4.0

[jira] [Resolved] (SPARK-24267) explicitly keep DataSourceReader in DataSourceV2Relation

2018-06-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24267. - Resolution: Won't Fix > explicitly keep DataSourceReader in DataSourceV2Relation >

[jira] [Commented] (SPARK-24478) DataSourceV2 should push filters and projection at physical plan conversion

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513332#comment-16513332 ] Apache Spark commented on SPARK-24478: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24478) DataSourceV2 should push filters and projection at physical plan conversion

2018-06-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-24478: --- Assignee: Ryan Blue > DataSourceV2 should push filters and projection at physical plan

[jira] [Resolved] (SPARK-24478) DataSourceV2 should push filters and projection at physical plan conversion

2018-06-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24478. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21503

[jira] [Updated] (SPARK-24560) Fix some getTimeAsMs as getTimeAsSeconds

2018-06-14 Thread xueyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xueyu updated SPARK-24560: -- Docs Text: (was: There are some places using "getTimeAsMs" rather than "getTimeAsSeconds". This will return

[jira] [Updated] (SPARK-24560) Fix some getTimeAsMs as getTimeAsSeconds

2018-06-14 Thread xueyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xueyu updated SPARK-24560: -- Description: There are some places using "getTimeAsMs" rather than "getTimeAsSeconds". This will return a

[jira] [Commented] (SPARK-21743) top-most limit should not cause memory leak

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513233#comment-16513233 ] Apache Spark commented on SPARK-21743: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Updated] (SPARK-24565) Add API for in Structured Streaming for exposing output rows of each microbatch as a DataFrame

2018-06-14 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-24565: -- Description: Currently, the micro-batches in the MicroBatchExecution are not exposed to the

[jira] [Commented] (SPARK-24534) Add a way to bypass entrypoint.sh script if no spark cmd is passed

2018-06-14 Thread Ricardo Martinelli de Oliveira (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513159#comment-16513159 ] Ricardo Martinelli de Oliveira commented on SPARK-24534: Guys, I sent a PR as

[jira] [Resolved] (SPARK-24248) [K8S] Use the Kubernetes cluster as the backing store for the state of pods

2018-06-14 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-24248. Resolution: Fixed Fix Version/s: 2.4.0 > [K8S] Use the Kubernetes cluster as the backing

[jira] [Assigned] (SPARK-24565) Add API for in Structured Streaming for exposing output rows of each microbatch as a DataFrame

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24565: Assignee: Tathagata Das (was: Apache Spark) > Add API for in Structured Streaming for

[jira] [Assigned] (SPARK-24565) Add API for in Structured Streaming for exposing output rows of each microbatch as a DataFrame

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24565: Assignee: Apache Spark (was: Tathagata Das) > Add API for in Structured Streaming for

[jira] [Commented] (SPARK-24565) Add API for in Structured Streaming for exposing output rows of each microbatch as a DataFrame

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513110#comment-16513110 ] Apache Spark commented on SPARK-24565: -- User 'tdas' has created a pull request for this issue:

[jira] [Created] (SPARK-24565) Add API for in Structured Streaming for exposing output rows of each microbatch as a DataFrame

2018-06-14 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-24565: - Summary: Add API for in Structured Streaming for exposing output rows of each microbatch as a DataFrame Key: SPARK-24565 URL: https://issues.apache.org/jira/browse/SPARK-24565

[jira] [Commented] (SPARK-24564) Add test suite for RecordBinaryComparator

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513095#comment-16513095 ] Apache Spark commented on SPARK-24564: -- User 'jiangxb1987' has created a pull request for this

[jira] [Assigned] (SPARK-24564) Add test suite for RecordBinaryComparator

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24564: Assignee: Apache Spark > Add test suite for RecordBinaryComparator >

[jira] [Assigned] (SPARK-24564) Add test suite for RecordBinaryComparator

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24564: Assignee: (was: Apache Spark) > Add test suite for RecordBinaryComparator >

[jira] [Created] (SPARK-24564) Add test suite for RecordBinaryComparator

2018-06-14 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-24564: Summary: Add test suite for RecordBinaryComparator Key: SPARK-24564 URL: https://issues.apache.org/jira/browse/SPARK-24564 Project: Spark Issue Type: Test

[jira] [Commented] (SPARK-24434) Support user-specified driver and executor pod templates

2018-06-14 Thread Yinan Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513064#comment-16513064 ] Yinan Li commented on SPARK-24434: -- [~skonto] Thanks! Will take a look at the design doc once I'm back

[jira] [Resolved] (SPARK-24319) run-example can not print usage

2018-06-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-24319. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21450

[jira] [Assigned] (SPARK-24319) run-example can not print usage

2018-06-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-24319: -- Assignee: Gabor Somogyi > run-example can not print usage >

[jira] [Commented] (SPARK-24559) Some zip files passed with spark-submit --archives causing "invalid CEN header" error

2018-06-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513036#comment-16513036 ] Marcelo Vanzin commented on SPARK-24559: {{\-\-archives}} is completely handled by YARN, so if

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-14 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513008#comment-16513008 ] Hossein Falaki commented on SPARK-24359: [~shivaram] I like that. > SPIP: ML Pipelines in R >

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-14 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513001#comment-16513001 ] Shivaram Venkataraman commented on SPARK-24359: --- Sounds good. Thanks [~falaki].

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-14 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512959#comment-16512959 ] Hossein Falaki commented on SPARK-24359: Considering that I am volunteering myself to do the

[jira] [Commented] (SPARK-24534) Add a way to bypass entrypoint.sh script if no spark cmd is passed

2018-06-14 Thread Trevor McKay (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512949#comment-16512949 ] Trevor McKay commented on SPARK-24534: -- This is useful in situations like this for the

[jira] [Resolved] (SPARK-24543) Support any DataType as DDL string for from_json's schema

2018-06-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24543. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21550

[jira] [Assigned] (SPARK-24543) Support any DataType as DDL string for from_json's schema

2018-06-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-24543: --- Assignee: Maxim Gekk > Support any DataType as DDL string for from_json's schema >

[jira] [Resolved] (SPARK-24563) Allow running PySpark shell without Hive

2018-06-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-24563. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21569

[jira] [Assigned] (SPARK-24563) Allow running PySpark shell without Hive

2018-06-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-24563: -- Assignee: Li Jin > Allow running PySpark shell without Hive >

[jira] [Commented] (SPARK-24563) Allow running PySpark shell without Hive

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512893#comment-16512893 ] Apache Spark commented on SPARK-24563: -- User 'icexelloss' has created a pull request for this

[jira] [Assigned] (SPARK-24563) Allow running PySpark shell without Hive

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24563: Assignee: Apache Spark > Allow running PySpark shell without Hive >

[jira] [Assigned] (SPARK-24563) Allow running PySpark shell without Hive

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24563: Assignee: (was: Apache Spark) > Allow running PySpark shell without Hive >

[jira] [Commented] (SPARK-24563) Allow running PySpark shell without Hive

2018-06-14 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512844#comment-16512844 ] Li Jin commented on SPARK-24563: Will submit a PR soon > Allow running PySpark shell without Hive >

[jira] [Updated] (SPARK-24563) Allow running PySpark shell without Hive

2018-06-14 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jin updated SPARK-24563: --- Description: A previous commit: 

[jira] [Created] (SPARK-24563) Allow running PySpark shell without Hive

2018-06-14 Thread Li Jin (JIRA)
Li Jin created SPARK-24563: -- Summary: Allow running PySpark shell without Hive Key: SPARK-24563 URL: https://issues.apache.org/jira/browse/SPARK-24563 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-24562) Allow running same tests with multiple configs in SQLQueryTestSuite

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24562: Assignee: (was: Apache Spark) > Allow running same tests with multiple configs in

[jira] [Commented] (SPARK-24562) Allow running same tests with multiple configs in SQLQueryTestSuite

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512769#comment-16512769 ] Apache Spark commented on SPARK-24562: -- User 'mgaido91' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24562) Allow running same tests with multiple configs in SQLQueryTestSuite

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24562: Assignee: Apache Spark > Allow running same tests with multiple configs in

[jira] [Created] (SPARK-24562) Allow running same tests with multiple configs in SQLQueryTestSuite

2018-06-14 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24562: --- Summary: Allow running same tests with multiple configs in SQLQueryTestSuite Key: SPARK-24562 URL: https://issues.apache.org/jira/browse/SPARK-24562 Project: Spark

[jira] [Commented] (SPARK-17025) Cannot persist PySpark ML Pipeline model that includes custom Transformer

2018-06-14 Thread Rafael Hernandez Murcia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512732#comment-16512732 ] Rafael Hernandez Murcia commented on SPARK-17025: - Any news about this? It seems that

[jira] [Commented] (SPARK-24495) SortMergeJoin with duplicate keys wrong results

2018-06-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512711#comment-16512711 ] Xiao Li commented on SPARK-24495: - We might need to release 2.3.2 since this is a serious bug. The

[jira] [Resolved] (SPARK-24495) SortMergeJoin with duplicate keys wrong results

2018-06-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24495. - Resolution: Fixed Assignee: Marco Gaido Fix Version/s: 2.4.0 2.3.2 >

[jira] [Updated] (SPARK-24495) SortMergeJoin with duplicate keys wrong results

2018-06-14 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24495: Priority: Blocker (was: Major) > SortMergeJoin with duplicate keys wrong results >

[jira] [Commented] (SPARK-22148) TaskSetManager.abortIfCompletelyBlacklisted should not abort when all current executors are blacklisted but dynamic allocation is enabled

2018-06-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512691#comment-16512691 ] Thomas Graves commented on SPARK-22148: --- ok, just update if you start working on it. thanks. >

[jira] [Commented] (SPARK-24539) HistoryServer does not display metrics from tasks that complete after stage failure

2018-06-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512643#comment-16512643 ] Thomas Graves commented on SPARK-24539: --- It's possible, I thought when I checked the history server

[jira] [Updated] (SPARK-24553) Job UI redirect causing http 302 error

2018-06-14 Thread Steven Kallman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Kallman updated SPARK-24553: --- Description: When on spark UI port 4040 jobs or stages tab, the href links for the

[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512519#comment-16512519 ] Thomas Graves commented on SPARK-24552: --- sorry just realized the v2 api is still marked experiment

[jira] [Updated] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24552: -- Priority: Critical (was: Blocker) > Task attempt numbers are reused when stages are retried

[jira] [Commented] (SPARK-22148) TaskSetManager.abortIfCompletelyBlacklisted should not abort when all current executors are blacklisted but dynamic allocation is enabled

2018-06-14 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512510#comment-16512510 ] Imran Rashid commented on SPARK-22148: -- [~tgraves] we might be able to work on this soon -- a week

[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512504#comment-16512504 ] Thomas Graves commented on SPARK-24552: --- Note if this is a correctness bug and can cause data

[jira] [Updated] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24552: -- Priority: Blocker (was: Major) > Task attempt numbers are reused when stages are retried >

[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512500#comment-16512500 ] Thomas Graves commented on SPARK-24552: --- I agree, I don't think changing the attempt number at

[jira] [Updated] (SPARK-22239) User-defined window functions with pandas udf (unbounded window)

2018-06-14 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jin updated SPARK-22239: --- Description: Window function is another place we can benefit from vectored udf and add another useful

[jira] [Created] (SPARK-24561) User-defined window functions with pandas udf (bounded window)

2018-06-14 Thread Li Jin (JIRA)
Li Jin created SPARK-24561: -- Summary: User-defined window functions with pandas udf (bounded window) Key: SPARK-24561 URL: https://issues.apache.org/jira/browse/SPARK-24561 Project: Spark Issue

[jira] [Updated] (SPARK-22239) User-defined window functions with pandas udf (unbounded window)

2018-06-14 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jin updated SPARK-22239: --- Summary: User-defined window functions with pandas udf (unbounded window) (was: User-defined window

[jira] [Comment Edited] (SPARK-13587) Support virtualenv in PySpark

2018-06-14 Thread Matt Mould (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512433#comment-16512433 ] Matt Mould edited comment on SPARK-13587 at 6/14/18 1:21 PM: - What is the

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2018-06-14 Thread Matt Mould (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512433#comment-16512433 ] Matt Mould commented on SPARK-13587: What is the current status of this ticket please? This

[jira] [Updated] (SPARK-24560) Fix some getTimeAsMs as getTimeAsSeconds

2018-06-14 Thread xueyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xueyu updated SPARK-24560: -- External issue URL: https://github.com/apache/spark/pull/21567 > Fix some getTimeAsMs as getTimeAsSeconds >

[jira] [Commented] (SPARK-24560) Fix some getTimeAsMs as getTimeAsSeconds

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512330#comment-16512330 ] Apache Spark commented on SPARK-24560: -- User 'xueyumusic' has created a pull request for this

[jira] [Assigned] (SPARK-24560) Fix some getTimeAsMs as getTimeAsSeconds

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24560: Assignee: (was: Apache Spark) > Fix some getTimeAsMs as getTimeAsSeconds >

[jira] [Assigned] (SPARK-24560) Fix some getTimeAsMs as getTimeAsSeconds

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24560: Assignee: Apache Spark > Fix some getTimeAsMs as getTimeAsSeconds >

[jira] [Created] (SPARK-24560) Fix some getTimeAsMs as getTimeAsSeconds

2018-06-14 Thread xueyu (JIRA)
xueyu created SPARK-24560: - Summary: Fix some getTimeAsMs as getTimeAsSeconds Key: SPARK-24560 URL: https://issues.apache.org/jira/browse/SPARK-24560 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-18739) Models in pyspark.classification and regression support setXXXCol methods

2018-06-14 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-18739. -- Resolution: Not A Problem > Models in pyspark.classification and regression support setXXXCol

[jira] [Issue Comment Deleted] (SPARK-24558) Driver prints the wrong info in the log when the executor which holds cacheBlock is IDLE.Time-out value displayed is not as per configuration value.

2018-06-14 Thread sandeep katta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandeep katta updated SPARK-24558: -- Comment: was deleted (was: pull request https://github.com/apache/spark/pull/21565) > Driver

[jira] [Assigned] (SPARK-24558) Driver prints the wrong info in the log when the executor which holds cacheBlock is IDLE.Time-out value displayed is not as per configuration value.

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24558: Assignee: Apache Spark > Driver prints the wrong info in the log when the executor which

[jira] [Commented] (SPARK-24558) Driver prints the wrong info in the log when the executor which holds cacheBlock is IDLE.Time-out value displayed is not as per configuration value.

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512301#comment-16512301 ] Apache Spark commented on SPARK-24558: -- User 'sandeep-katta' has created a pull request for this

[jira] [Assigned] (SPARK-24558) Driver prints the wrong info in the log when the executor which holds cacheBlock is IDLE.Time-out value displayed is not as per configuration value.

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24558: Assignee: (was: Apache Spark) > Driver prints the wrong info in the log when the

[jira] [Created] (SPARK-24559) Some zip files passed with spark-submit --archives causing "invalid CEN header" error

2018-06-14 Thread James Porritt (JIRA)
James Porritt created SPARK-24559: - Summary: Some zip files passed with spark-submit --archives causing "invalid CEN header" error Key: SPARK-24559 URL: https://issues.apache.org/jira/browse/SPARK-24559

[jira] [Commented] (SPARK-24558) Driver prints the wrong info in the log when the executor which holds cacheBlock is IDLE.Time-out value displayed is not as per configuration value.

2018-06-14 Thread sandeep katta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512288#comment-16512288 ] sandeep katta commented on SPARK-24558: --- pull request https://github.com/apache/spark/pull/21565

[jira] [Updated] (SPARK-24327) Verify and normalize a partition column name based on the JDBC resolved schema

2018-06-14 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-24327: - Description: We need to modify JDBC datasource code to verify and normalize a partition

[jira] [Updated] (SPARK-24327) Verify and normalize a partition column name based on the JDBC resolved schema

2018-06-14 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-24327: - Summary: Verify and normalize a partition column name based on the JDBC resolved schema

[jira] [Created] (SPARK-24558) Driver prints the wrong info in the log when the executor which holds cacheBlock is IDLE.Time-out value displayed is not as per configuration value.

2018-06-14 Thread sandeep katta (JIRA)
sandeep katta created SPARK-24558: - Summary: Driver prints the wrong info in the log when the executor which holds cacheBlock is IDLE.Time-out value displayed is not as per configuration value. Key: SPARK-24558

[jira] [Commented] (SPARK-14174) Implement the Mini-Batch KMeans

2018-06-14 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512214#comment-16512214 ] zhengruifeng commented on SPARK-14174: -- [~mlnick] [~mengxr] [~josephkb] Mini-Batch KMeans is much

[jira] [Resolved] (SPARK-19422) Cache input data in algorithms

2018-06-14 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-19422. -- Resolution: Not A Problem > Cache input data in algorithms > -- >

[jira] [Commented] (SPARK-24556) ReusedExchange should rewrite output partitioning also when child's partitioning is RangePartitioning

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512174#comment-16512174 ] Apache Spark commented on SPARK-24556: -- User 'yucai' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24556) ReusedExchange should rewrite output partitioning also when child's partitioning is RangePartitioning

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24556: Assignee: Apache Spark > ReusedExchange should rewrite output partitioning also when

[jira] [Assigned] (SPARK-24556) ReusedExchange should rewrite output partitioning also when child's partitioning is RangePartitioning

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24556: Assignee: (was: Apache Spark) > ReusedExchange should rewrite output partitioning

[jira] [Commented] (SPARK-11107) spark.ml should support more input column types: umbrella

2018-06-14 Thread Lee Dongjin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512162#comment-16512162 ] Lee Dongjin commented on SPARK-11107: - [~josephkb] Excuse me. Is there any reason this issue is

[jira] [Assigned] (SPARK-24557) ClusteringEvaluator support array input

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24557: Assignee: (was: Apache Spark) > ClusteringEvaluator support array input >

[jira] [Assigned] (SPARK-24557) ClusteringEvaluator support array input

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24557: Assignee: Apache Spark > ClusteringEvaluator support array input >

[jira] [Commented] (SPARK-24557) ClusteringEvaluator support array input

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512158#comment-16512158 ] Apache Spark commented on SPARK-24557: -- User 'zhengruifeng' has created a pull request for this

[jira] [Commented] (SPARK-24530) pyspark.ml doesn't generate class docs correctly

2018-06-14 Thread Lee Dongjin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512159#comment-16512159 ] Lee Dongjin commented on SPARK-24530: - OMG, I am sorry; I was misunderstanding. The documentation is

[jira] [Created] (SPARK-24557) ClusteringEvaluator support array input

2018-06-14 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-24557: Summary: ClusteringEvaluator support array input Key: SPARK-24557 URL: https://issues.apache.org/jira/browse/SPARK-24557 Project: Spark Issue Type:

[jira] [Commented] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2018-06-14 Thread Lee Dongjin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512153#comment-16512153 ] Lee Dongjin commented on SPARK-4591: [~josephkb] Excuse me. By SPARK-14376 was resolved recently, I

[jira] [Updated] (SPARK-24556) ReusedExchange should rewrite output partitioning also when child's partitioning is RangePartitioning

2018-06-14 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-24556: -- Description: Currently, ReusedExchange would rewrite output partitioning if child's partitioning is

[jira] [Updated] (SPARK-24556) ReusedExchange should rewrite output partitioning also when child's partitioning is RangePartitioning

2018-06-14 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-24556: -- Description: Currently, ReusedExchange would rewrite output partitioning if child's partitioning is

[jira] [Created] (SPARK-24556) ReusedExchange should rewrite output partitioning also when child's partitioning is RangePartitioning

2018-06-14 Thread yucai (JIRA)
yucai created SPARK-24556: - Summary: ReusedExchange should rewrite output partitioning also when child's partitioning is RangePartitioning Key: SPARK-24556 URL: https://issues.apache.org/jira/browse/SPARK-24556

[jira] [Resolved] (SPARK-20932) CountVectorizer support handle persistence

2018-06-14 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-20932. -- Resolution: Not A Problem > CountVectorizer support handle persistence >

[jira] [Resolved] (SPARK-22971) OneVsRestModel should use temporary RawPredictionCol

2018-06-14 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-22971. -- Resolution: Not A Problem > OneVsRestModel should use temporary RawPredictionCol >

[jira] [Assigned] (SPARK-24555) logNumExamples in KMeans/BiKM/GMM/AFT/NB

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24555: Assignee: Apache Spark > logNumExamples in KMeans/BiKM/GMM/AFT/NB >

[jira] [Assigned] (SPARK-24555) logNumExamples in KMeans/BiKM/GMM/AFT/NB

2018-06-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24555: Assignee: (was: Apache Spark) > logNumExamples in KMeans/BiKM/GMM/AFT/NB >
