[jira] [Resolved] (SPARK-27041) large partition data cause pyspark with python2.x oom

2019-03-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27041. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23954

[jira] [Resolved] (SPARK-27125) Add test suite for sql execution page

2019-03-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27125. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24052

[jira] [Assigned] (SPARK-27125) Add test suite for sql execution page

2019-03-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27125: - Assignee: shahid > Add test suite for sql execution page >

[jira] [Resolved] (SPARK-27109) Refactoring of TimestampFormatter and DateFormatter

2019-03-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27109. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24030

[jira] [Assigned] (SPARK-27109) Refactoring of TimestampFormatter and DateFormatter

2019-03-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27109: - Assignee: Maxim Gekk > Refactoring of TimestampFormatter and DateFormatter >

[jira] [Commented] (SPARK-26839) on JDK11, IsolatedClientLoader must be able to load java.sql classes

2019-03-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789706#comment-16789706 ] Sean Owen commented on SPARK-26839: --- I'll open my WIP PR on this. I am not sure datanucleus is the

[jira] [Updated] (SPARK-26860) Improve RangeBetween docs in Pyspark, SparkR

2019-03-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26860: -- Labels: (was: docs easyfix python) Priority: Minor (was: Major) Component/s:

[jira] [Commented] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789598#comment-16789598 ] Sean Owen commented on SPARK-26961: --- I think registerAsParallelCapable() sounds like it could resolve

[jira] [Resolved] (SPARK-26860) Improve RangeBetween docs in Pyspark, SparkR

2019-03-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26860. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23946

[jira] [Assigned] (SPARK-26860) Improve RangeBetween docs in Pyspark, SparkR

2019-03-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26860: - Assignee: Jagadesh Kiran N > Improve RangeBetween docs in Pyspark, SparkR >

[jira] [Assigned] (SPARK-27116) Environment tab must sort Hadoop Configuration by default

2019-03-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27116: - Assignee: Ajith S > Environment tab must sort Hadoop Configuration by default >

[jira] [Reopened] (SPARK-26860) RangeBetween docs appear to be wrong

2019-03-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-26860: --- > RangeBetween docs appear to be wrong > - > >

[jira] [Resolved] (SPARK-27116) Environment tab must sort Hadoop Configuration by default

2019-03-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27116. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24038

[jira] [Assigned] (SPARK-24621) WebUI - application 'name' urls point to http instead of https (even when ssl enabled)

2019-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-24621: - Assignee: Gabor Somogyi > WebUI - application 'name' urls point to http instead of https (even

[jira] [Resolved] (SPARK-24621) WebUI - application 'name' urls point to http instead of https (even when ssl enabled)

2019-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-24621. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23991

[jira] [Created] (SPARK-27122) YARN test failures in Java 9+

2019-03-10 Thread Sean Owen (JIRA)
Sean Owen created SPARK-27122: - Summary: YARN test failures in Java 9+ Key: SPARK-27122 URL: https://issues.apache.org/jira/browse/SPARK-27122 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-27121) Resolve Scala compiler failure for Java 9+ in REPL

2019-03-10 Thread Sean Owen (JIRA)
Sean Owen created SPARK-27121: - Summary: Resolve Scala compiler failure for Java 9+ in REPL Key: SPARK-27121 URL: https://issues.apache.org/jira/browse/SPARK-27121 Project: Spark Issue Type:

[jira] [Updated] (SPARK-27120) Upgrade scalatest version to 3.0.5

2019-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27120: -- Priority: Trivial (was: Major) > Upgrade scalatest version to 3.0.5 >

[jira] [Resolved] (SPARK-26770) Misleading/unhelpful error message when wrapping a null in an Option

2019-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26770. --- Resolution: Not A Problem > Misleading/unhelpful error message when wrapping a null in an Option >

[jira] [Resolved] (SPARK-25350) Spark Serving

2019-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25350. --- Resolution: Won't Fix > Spark Serving > - > > Key: SPARK-25350 >

[jira] [Resolved] (SPARK-25982) Dataframe write is non blocking in fair scheduling mode

2019-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25982. --- Resolution: Not A Problem > Dataframe write is non blocking in fair scheduling mode >

[jira] [Resolved] (SPARK-26261) Spark does not check completeness temporary file

2019-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26261. --- Resolution: Not A Problem > Spark does not check completeness temporary file >

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788771#comment-16788771 ] Sean Owen commented on SPARK-26555: --- What is the fixed data set that reproduces this, to be clear? And

[jira] [Updated] (SPARK-27090) Removing old LEGACY_DRIVER_IDENTIFIER ("")

2019-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27090: -- Docs Text: The executor ID for the driver has been "driver" rather than "" since Spark 1.5.

[jira] [Updated] (SPARK-27114) SQL Tab shows duplicate executions for some commands

2019-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27114: -- Priority: Minor (was: Major) I don't know much about this area. Does it actually try to execute

[jira] [Resolved] (SPARK-25838) Remove formatVersion from Saveable

2019-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25838. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22830

[jira] [Assigned] (SPARK-25838) Remove formatVersion from Saveable

2019-03-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-25838: - Assignee: Marco Gaido > Remove formatVersion from Saveable >

[jira] [Resolved] (SPARK-27056) Remove `start-shuffle-service.sh`

2019-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27056. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23975

[jira] [Assigned] (SPARK-27056) Remove `start-shuffle-service.sh`

2019-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27056: - Assignee: liuxian > Remove `start-shuffle-service.sh` > -- >

[jira] [Resolved] (SPARK-8547) xgboost exploration

2019-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-8547. -- Resolution: Won't Fix I think the resolution is "xgboost4j-scala" > xgboost exploration >

[jira] [Resolved] (SPARK-9610) Class and instance weighting for ML

2019-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-9610. -- Resolution: Done Except for GBTs, this is done, so I'm going to close the umbrella > Class and

[jira] [Resolved] (SPARK-9478) Add sample weights to Random Forest

2019-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-9478. -- Resolution: Duplicate > Add sample weights to Random Forest > --- > >

[jira] [Resolved] (SPARK-14599) BaggedPoint should support weighted instances.

2019-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-14599. --- Resolution: Duplicate Weighted points were added with SPARK-19591 > BaggedPoint should support

[jira] [Assigned] (SPARK-27079) Fix typo & Remove useless imports

2019-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27079: - Assignee: EdisonWang > Fix typo & Remove useless imports > - >

[jira] [Resolved] (SPARK-27079) Fix typo & Remove useless imports

2019-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27079. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24000

[jira] [Updated] (SPARK-27079) Fix typo & Remove useless imports

2019-03-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27079: -- Priority: Trivial (was: Minor) Yeah [~EdisonWang] this isn't meaningful as a JIRA. If it's truly

[jira] [Updated] (SPARK-27081) Support launching executors in existed Pods

2019-03-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27081: -- Target Version/s: (was: 2.4.0) > Support launching executors in existed Pods >

[jira] [Commented] (SPARK-27006) SPIP: .NET bindings for Apache Spark

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786030#comment-16786030 ] Sean Owen commented on SPARK-27006: --- You can use the Apache license, and follow Apache processes,

[jira] [Commented] (SPARK-26742) Bump Kubernetes Client Version to 4.1.2

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785842#comment-16785842 ] Sean Owen commented on SPARK-26742: --- [~shaneknapp] ping me when ready and I can try the PR again. >

[jira] [Commented] (SPARK-26742) Bump Kubernetes Client Version to 4.1.2

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785734#comment-16785734 ] Sean Owen commented on SPARK-26742: --- The problem is that that's a major dependency upgrade in a

[jira] [Resolved] (SPARK-27047) Document stop-slave.sh in spark-standalone

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27047. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23960

[jira] [Updated] (SPARK-26016) Document that UTF-8 is required in text data source

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26016: -- Issue Type: Improvement (was: Bug) Summary: Document that UTF-8 is required in text data

[jira] [Assigned] (SPARK-27047) Document stop-slave.sh in spark-standalone

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27047: - Assignee: Ajith S > Document stop-slave.sh in spark-standalone >

[jira] [Resolved] (SPARK-26602) Subsequent queries are failing after querying the UDF which is loaded with wrong hdfs path

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26602. --- Resolution: Not A Problem > Subsequent queries are failing after querying the UDF which is loaded

[jira] [Resolved] (SPARK-26981) Add 'Recall_at_k' metric to RankingMetrics

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26981. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23881

[jira] [Resolved] (SPARK-27035) Current time with microsecond resolution

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27035. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23945

[jira] [Assigned] (SPARK-27035) Current time with microsecond resolution

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27035: - Assignee: Maxim Gekk > Current time with microsecond resolution >

[jira] [Updated] (SPARK-27035) Current time with microsecond resolution

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27035: -- Docs Text: In Spark version 2.4 and earlier, the `current_timestamp` function returns a timestamp

[jira] [Assigned] (SPARK-26981) Add 'Recall_at_k' metric to RankingMetrics

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26981: - Assignee: Masahiro Kazama > Add 'Recall_at_k' metric to RankingMetrics >

[jira] [Resolved] (SPARK-27031) Avoid double formatting in timestampToString

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27031. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23936

[jira] [Assigned] (SPARK-27031) Avoid double formatting in timestampToString

2019-03-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27031: - Assignee: Maxim Gekk > Avoid double formatting in timestampToString >

[jira] [Commented] (SPARK-27025) Speed up toLocalIterator

2019-03-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784892#comment-16784892 ] Sean Owen commented on SPARK-27025: --- You'll want to cache() the thing you call toLocalIterator() on no

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 2.0.0

2019-03-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784914#comment-16784914 ] Sean Owen commented on SPARK-18057: --- [~skonto] go for it. I lost the context on this one but if we

[jira] [Updated] (SPARK-27060) DDL Commands are accepting Keywords like create, drop as tableName

2019-03-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27060: -- Target Version/s: (was: 2.4.0) Priority: Minor (was: Major) Fix Version/s:

[jira] [Commented] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path

2019-03-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784528#comment-16784528 ] Sean Owen commented on SPARK-26602: --- If a user adds something to the classpath, it matters to the

[jira] [Commented] (SPARK-26602) Insert into table fails after querying the UDF which is loaded with wrong hdfs path

2019-03-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784446#comment-16784446 ] Sean Owen commented on SPARK-26602: --- That sounds like user error. I'd close this as NotAProblem. It

[jira] [Commented] (SPARK-26972) Issue with CSV import and inferSchema set to true

2019-03-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784453#comment-16784453 ] Sean Owen commented on SPARK-26972: --- I haven't checked when it's case-insensitive, but to be clear:

[jira] [Updated] (SPARK-27048) A way to execute functions on Executor Startup and Executor Exit in Standalone

2019-03-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27048: -- Target Version/s: (was: 2.4.0) > A way to execute functions on Executor Startup and Executor Exit

[jira] [Commented] (SPARK-26016) Encoding not working when using a map / mapPartitions call

2019-03-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783878#comment-16783878 ] Sean Owen commented on SPARK-26016: --- It's "Fixed" in the sense that at least we plugged the

[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-03-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783858#comment-16783858 ] Sean Owen commented on SPARK-26947: --- That doesn't sound "very big" but how big are the vectors you

[jira] [Commented] (SPARK-26016) Encoding not working when using a map / mapPartitions call

2019-03-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783513#comment-16783513 ] Sean Owen commented on SPARK-26016: --- [~maxgekk] yeah I'm thinking of parts like

[jira] [Resolved] (SPARK-27046) Remove SPARK-19185 related references from documentation since its resolved

2019-03-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27046. --- Resolution: Fixed Fix Version/s: 2.4.1, 3.0.0 Issue resolved by pull

[jira] [Assigned] (SPARK-27046) Remove SPARK-19185 related references from documentation since its resolved

2019-03-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27046: - Assignee: Gabor Somogyi > Remove SPARK-19185 related references from documentation since its

[jira] [Commented] (SPARK-26972) Issue with CSV import and inferSchema set to true

2019-03-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783440#comment-16783440 ] Sean Owen commented on SPARK-26972: --- One explanation for your comment about "multiline" support is

[jira] [Commented] (SPARK-26247) SPIP - ML Model Extension for no-Spark MLLib Online Serving

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782876#comment-16782876 ] Sean Owen commented on SPARK-26247: --- There are two issues here -- load time of the model, and scoring

[jira] [Commented] (SPARK-25130) [Python] Wrong timestamp returned by toPandas

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782846#comment-16782846 ] Sean Owen commented on SPARK-25130: --- [~maxgekk] is this likely fixed by your overhaul of time parsing?

[jira] [Resolved] (SPARK-25201) Synchronization performed on AtomicReference in LevelDB class

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25201. --- Resolution: Invalid I don't see a problem statement here > Synchronization performed on

[jira] [Commented] (SPARK-25350) Spark Serving

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782844#comment-16782844 ] Sean Owen commented on SPARK-25350: --- I think this kind of thing is great, but belongs outside the

[jira] [Resolved] (SPARK-25405) Saving RDD with new Hadoop API file as a Sequence File too restrictive

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25405. --- Resolution: Not A Problem Looks like you are using the old Mapreduce OutputFormat classes with the

[jira] [Resolved] (SPARK-25441) calculate term frequency in CountVectorizer()

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25441. --- Resolution: Won't Fix What you have there is already term frequency. If you want to normalize it to

[jira] [Resolved] (SPARK-25552) Upgrade from Spark 1.6.3 to 2.3.0 seems to make jobs use about 50% more memory

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25552. --- Resolution: Invalid This is too broad. Literally 1 things change from 1.6 to 2.3. You'd have to

[jira] [Updated] (SPARK-25466) Documentation does not specify how to set Kafka consumer cache capacity for SS

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-25466: -- Labels: (was: doc ss) Priority: Minor (was: Major) Component/s: Documentation

[jira] [Commented] (SPARK-25544) Slow/failed convergence in Spark ML models due to internal predictor scaling

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782841#comment-16782841 ] Sean Owen commented on SPARK-25544: --- I think this is a reasonable change -- you can test it in a PR if

[jira] [Resolved] (SPARK-25550) [Spark Job History] Environment Page of Spark Job History UI showing wrong value for spark.ui.retainedJobs

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25550. --- Resolution: Won't Fix > [Spark Job History] Environment Page of Spark Job History UI showing wrong

[jira] [Resolved] (SPARK-25733) The method toLocalIterator() with dataframe doesn't work

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25733. --- Resolution: Cannot Reproduce I can't reproduce this; with a simple local test (and Spark unit

[jira] [Resolved] (SPARK-25562) The Spark add audit log

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25562. --- Resolution: Invalid > The Spark add audit log > --- > > Key:

[jira] [Resolved] (SPARK-25633) Performance Improvement for Drools Spark Jobs.

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25633. --- Resolution: Invalid I can't make out a specific issue here. JIRA isn't for tech support questions;

[jira] [Resolved] (SPARK-25853) Parts of spark components (DAG Visualization and executors page) not available in Internet Explorer

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25853. --- Resolution: Won't Fix It looks like recent versions of Internet Explorer, as Edge, do support this.

[jira] [Commented] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782810#comment-16782810 ] Sean Owen commented on SPARK-25863: --- Returning 0 seems like the correct thing to do, locally.

[jira] [Commented] (SPARK-25982) Dataframe write is non blocking in fair scheduling mode

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782808#comment-16782808 ] Sean Owen commented on SPARK-25982: --- Can you clarify with a more complete example? what is running in

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782806#comment-16782806 ] Sean Owen commented on SPARK-26555: --- To be clear, is there a data set that works only when not run in

[jira] [Commented] (SPARK-26881) Scaling issue with Gramian computation for RowMatrix: too many results sent to driver

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782803#comment-16782803 ] Sean Owen commented on SPARK-26881: --- [~gagafunctor] would you like to open a pull request? I think the

[jira] [Resolved] (SPARK-26906) Pyspark RDD Replication Potentially Not Working

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26906. --- Resolution: Cannot Reproduce > Pyspark RDD Replication Potentially Not Working >

[jira] [Resolved] (SPARK-26980) Kryo deserialization not working with KryoSerializable class

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26980. --- Resolution: Not A Problem Yes, my guess is it's because you're using Spark's Kryo and config, and

[jira] [Commented] (SPARK-26991) Investigate difference of `returnNullable` between ScalaReflection.deserializerFor and JavaTypeInference.deserializerFor

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782796#comment-16782796 ] Sean Owen commented on SPARK-26991: --- This is a philosophical question about what JIRA is for. JIRA was

[jira] [Commented] (SPARK-27025) Speed up toLocalIterator

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782789#comment-16782789 ] Sean Owen commented on SPARK-27025: --- It's an interesting question; let's break it down. Calling

[jira] [Resolved] (SPARK-26620) DataFrameReader.json and csv in Python should accept DataFrame.

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26620. --- Resolution: Not A Problem > DataFrameReader.json and csv in Python should accept DataFrame. >

[jira] [Resolved] (SPARK-26610) Fix inconsistency between toJSON Method in Python and Scala

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26610. --- Resolution: Not A Problem > Fix inconsistency between toJSON Method in Python and Scala >

[jira] [Commented] (SPARK-26016) Encoding not working when using a map / mapPartitions call

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782774#comment-16782774 ] Sean Owen commented on SPARK-26016: --- [~maxgekk] I see, but am I correct that in the text source,

[jira] [Reopened] (SPARK-26918) All .md should have ASF license header

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-26918: --- Huh, OK. I had thought all these headers were actually conveniences, and technically redundant with

[jira] [Resolved] (SPARK-27016) Treat all antlr warnings as errors while generating parser from the sql grammar file.

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27016. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23925

[jira] [Assigned] (SPARK-27016) Treat all antlr warnings as errors while generating parser from the sql grammar file.

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27016: - Assignee: Dilip Biswal > Treat all antlr warnings as errors while generating parser from the

[jira] [Resolved] (SPARK-26274) Download page must link to https://www.apache.org/dist/spark for current releases

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26274. --- Resolution: Fixed Assignee: Sean Owen Fix Version/s: 2.3.3 > Download page must

[jira] [Commented] (SPARK-26146) CSV wouln't be ingested in Spark 2.4.0 with Scala 2.12

2019-03-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782759#comment-16782759 ] Sean Owen commented on SPARK-26146: --- Wait a sec, here's the explanation:

[jira] [Updated] (SPARK-25762) Upgrade guava version in spark dependency lists due to CVE issue

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-25762: -- Target Version/s: 3.0.0 Updating Guava can be really disruptive, but we shade it, so it's possible I

[jira] [Commented] (SPARK-25781) relative importance of linear regression

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782537#comment-16782537 ] Sean Owen commented on SPARK-25781: --- PS are you referring to Shapley values? > relative importance of

[jira] [Resolved] (SPARK-25941) Random forest score decreased due to updating spark version

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25941. --- Resolution: Not A Problem This isn't a bug. The implementation has changed in Spark and MLlib a lot

[jira] [Resolved] (SPARK-25814) spark driver runs out of memory on org.apache.spark.util.kvstore.InMemoryStore

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25814. --- Resolution: Duplicate I think there are a few related items that this could be a duplicate of;

[jira] [Resolved] (SPARK-25911) [spark-ml] Hypothesis testing module

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25911. --- Resolution: Won't Fix I don't think we'd add all of those. Some of these are already in JIRA as

[jira] [Resolved] (SPARK-25928) NoSuchMethodError net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25928. --- Resolution: Not A Problem Yeah, you have an lz4-java version conflict somewhere. That's not a Spark
