[jira] [Commented] (SPARK-26146) CSV wouldn't be ingested in Spark 2.4.0 with Scala 2.12

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782526#comment-16782526 ] Sean Owen commented on SPARK-26146: --- It's possible, even likely, that the dependency is transitive. In

[jira] [Commented] (SPARK-25982) Dataframe write is non blocking in fair scheduling mode

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782521#comment-16782521 ] Sean Owen commented on SPARK-25982: --- I don't understand this; you're running operations in parallel on

[jira] [Commented] (SPARK-26016) Encoding not working when using a map / mapPartitions call

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782520#comment-16782520 ] Sean Owen commented on SPARK-26016: --- Yeah, the TL;DR from reading the code is that it writes UTF-8,

[jira] [Commented] (SPARK-26045) Error in the spark 2.4 release package with the spark-avro_2.11 dependency

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782508#comment-16782508 ] Sean Owen commented on SPARK-26045: --- Avro is already on 1.8.2 in 2.4.0. I think the problem may be

[jira] [Resolved] (SPARK-26017) SVD++ error rate is high in the test suite.

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26017. --- Resolution: Invalid I don't think this states an actionable issue; high relative to what? > SVD++

[jira] [Commented] (SPARK-26166) CrossValidator.fit() bug,training and validation dataset may overlap

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782502#comment-16782502 ] Sean Owen commented on SPARK-26166: --- I agree there's a problem there. I don't think checkpoint() is

[jira] [Updated] (SPARK-26058) Incorrect logging class loaded for all the logs.

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26058: -- Priority: Minor (was: Major) Description: In order to make the bug more evident, please

[jira] [Resolved] (SPARK-26128) filter breaks input_file_name

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26128. --- Resolution: Cannot Reproduce > filter breaks input_file_name > - > >

[jira] [Resolved] (SPARK-26146) CSV wouldn't be ingested in Spark 2.4.0 with Scala 2.12

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26146. --- Resolution: Cannot Reproduce Is that code from Spark? it doesn't quite look like it from the stack

[jira] [Commented] (SPARK-26152) Flaky test: BroadcastSuite

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782505#comment-16782505 ] Sean Owen commented on SPARK-26152: --- I haven't seen this one in a while, FWIW. > Flaky test:

[jira] [Resolved] (SPARK-26162) ALS results vary with user or item ID encodings

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26162. --- Resolution: Not A Problem I don't think it's a bug; the distributed computation and blocking might

[jira] [Resolved] (SPARK-26183) ConcurrentModificationException when using Spark collectionAccumulator

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26183. --- Resolution: Not A Problem This doesn't look like a bug. It arises because you are (inadvertently)

[jira] [Resolved] (SPARK-26214) Add "broadcast" method to DataFrame

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26214. --- Resolution: Won't Fix > Add "broadcast" method to DataFrame > --- >

[jira] [Commented] (SPARK-26224) Results in stackOverFlowError when trying to add 3000 new columns using withColumn function of dataframe.

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782496#comment-16782496 ] Sean Owen commented on SPARK-26224: --- The most realistic thing I can imagine is exposing `withColumns`.

[jira] [Resolved] (SPARK-26229) Expose SizeEstimator as a developer API in pyspark

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26229. --- Resolution: Won't Fix It's pretty internal. I don't think we'd want to expose it further,

[jira] [Commented] (SPARK-26247) SPIP - ML Model Extension for no-Spark MLLib Online Serving

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782494#comment-16782494 ] Sean Owen commented on SPARK-26247: --- I tend to think of this problem as about as solved as it will be

[jira] [Commented] (SPARK-26257) SPIP: Interop Support for Spark Language Extensions

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782491#comment-16782491 ] Sean Owen commented on SPARK-26257: --- I'm not sure how much of the Pyspark and SparkR integration is

[jira] [Commented] (SPARK-26261) Spark does not check completeness temporary file

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782489#comment-16782489 ] Sean Owen commented on SPARK-26261: --- Why would you truncate the file -- how would this happen

[jira] [Resolved] (SPARK-26266) Update to Scala 2.12.8

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26266. --- Resolution: Fixed Assignee: Sean Owen (was: Yuming Wang) Fix Version/s: 3.0.0

[jira] [Resolved] (SPARK-26272) Please delete old releases from mirroring system

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26272. --- Resolution: Fixed Fix Version/s: 2.3.3 Yep, we do this regularly, and do it alongside

[jira] [Commented] (SPARK-26943) Weird behaviour with `.cache()`

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782429#comment-16782429 ] Sean Owen commented on SPARK-26943: --- You can download and run the Spark distro wherever you like. If

[jira] [Updated] (SPARK-26274) Download page must link to https://www.apache.org/dist/spark for current releases

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26274: -- Priority: Minor (was: Major) Component/s: (was: Web UI) (was: Deploy)

[jira] [Resolved] (SPARK-27007) add rawPrediction to OneVsRest in PySpark

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27007. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23910

[jira] [Assigned] (SPARK-27007) add rawPrediction to OneVsRest in PySpark

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-27007: - Assignee: Huaxin Gao > add rawPrediction to OneVsRest in PySpark >

[jira] [Resolved] (SPARK-26860) RangeBetween docs appear to be wrong

2019-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26860. --- Resolution: Not A Problem > RangeBetween docs appear to be wrong >

[jira] [Created] (SPARK-27032) Flaky test: org.apache.spark.sql.execution.streaming.HDFSMetadataLogSuite.HDFSMetadataLog: metadata directory collision

2019-03-02 Thread Sean Owen (JIRA)
Sean Owen created SPARK-27032: - Summary: Flaky test: org.apache.spark.sql.execution.streaming.HDFSMetadataLogSuite.HDFSMetadataLog: metadata directory collision Key: SPARK-27032 URL:

[jira] [Resolved] (SPARK-26492) support streaming DecisionTreeRegressor

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26492. --- Resolution: Won't Fix That's no extra info, and not a valid reason to reopen this. > support

[jira] [Closed] (SPARK-26492) support streaming DecisionTreeRegressor

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen closed SPARK-26492. - > support streaming DecisionTreeRegressor > --- > >

[jira] [Created] (SPARK-27029) Update Thrift to 0.12.0

2019-03-01 Thread Sean Owen (JIRA)
Sean Owen created SPARK-27029: - Summary: Update Thrift to 0.12.0 Key: SPARK-27029 URL: https://issues.apache.org/jira/browse/SPARK-27029 Project: Spark Issue Type: Task Components:

[jira] [Assigned] (SPARK-26977) Warn against subclassing scala.App doesn't work

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26977: - Assignee: Manu Zhang > Warn against subclassing scala.App doesn't work >

[jira] [Resolved] (SPARK-26977) Warn against subclassing scala.App doesn't work

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26977. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23903

[jira] [Resolved] (SPARK-26387) Parallelism seems to cause difference in CrossValidation model metrics

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26387. --- Resolution: Not A Problem It shouldn't have any effect. But, you might get different results on

[jira] [Resolved] (SPARK-26458) OneHotEncoderModel verifies the number of category values incorrectly when tries to transform a dataframe.

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26458. --- Resolution: Not A Problem I don't quite get this; it already accounts for handleInvalid and

[jira] [Resolved] (SPARK-26395) Spark Thrift server memory leak

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26395. --- Resolution: Duplicate If this isn't resolved for you in 2.3.3 we can reopen, but the duplicate

[jira] [Resolved] (SPARK-26404) set spark.pyspark.python or PYSPARK_PYTHON doesn't work in k8s client-cluster mode.

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26404. --- Resolution: Not A Problem > set spark.pyspark.python or PYSPARK_PYTHON doesn't work in k8s

[jira] [Resolved] (SPARK-26407) For an external non-partitioned table, if add a directory named with k=v to the table path, select result will be wrong

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26407. --- Resolution: Not A Problem I don't think it's reasonable to add arbitrary other dirs under this

[jira] [Commented] (SPARK-26425) Add more constraint checks in file streaming source to avoid checkpoint corruption

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782196#comment-16782196 ] Sean Owen commented on SPARK-26425: --- [~kabhwan] I think you're welcome to work on this. > Add more

[jira] [Resolved] (SPARK-26408) java.util.NoSuchElementException: None.get at scala.None$.get(Option.scala:347)

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26408. --- Resolution: Not A Problem > java.util.NoSuchElementException: None.get at >

[jira] [Resolved] (SPARK-26492) support streaming DecisionTreeRegressor

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26492. --- Resolution: Won't Fix I'm not even sure that can be implemented in a one-pass algorithm? You'd have

[jira] [Commented] (SPARK-26494) 【spark sql】Use spark to read oracle TIMESTAMP(6) WITH LOCAL TIME ZONE type can't be found,

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782193#comment-16782193 ] Sean Owen commented on SPARK-26494: --- Correct me if I'm wrong, but does a timestamp without local time

[jira] [Commented] (SPARK-26505) Catalog class Function is missing "database" field

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782190#comment-16782190 ] Sean Owen commented on SPARK-26505: --- Go ahead with a PR > Catalog class Function is missing

[jira] [Resolved] (SPARK-26506) RegressionMetrics fails in Spark 2.4

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26506. --- Resolution: Cannot Reproduce There's not enough info here. What's the error? these work in general

[jira] [Resolved] (SPARK-26512) Spark 2.4.0 is not working with Hadoop 2.8.3 in windows 10

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26512. --- Resolution: Cannot Reproduce > Spark 2.4.0 is not working with Hadoop 2.8.3 in windows 10 >

[jira] [Commented] (SPARK-26518) UI Application Info Race Condition Can Throw NoSuchElement

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782186#comment-16782186 ] Sean Owen commented on SPARK-26518: --- It looks easy to handle the case where applicationInfo() isn't

[jira] [Resolved] (SPARK-26523) Getting this error while reading from kinesis :- Could not read until the end sequence number of the range: SequenceNumberRange

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26523. --- Resolution: Not A Problem This would have to have more info or be narrowed down more to make this

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782179#comment-16782179 ] Sean Owen commented on SPARK-26555: --- Hm, I might have that backwards; might be when all are not None?

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782146#comment-16782146 ] Sean Owen commented on SPARK-26555: --- This doesn't look like a Spark bug. It comes up, I think, when

[jira] [Updated] (SPARK-26589) proper `median` method for spark dataframe

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26589: -- Priority: Minor (was: Major) Would you like to implement it? It's kind of DIY here. It's not crazy

[jira] [Updated] (SPARK-26534) Closure Cleaner Bug

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26534: -- Priority: Minor (was: Major) Yes, the closure cleaner has never been able to be 100% sure it gets

[jira] [Resolved] (SPARK-26568) Too many partitions may cause thriftServer frequently Full GC

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26568. --- Resolution: Not A Problem I don't think this is actionable; you're generally saying that the Hive

[jira] [Commented] (SPARK-26570) Out of memory when InMemoryFileIndex bulkListLeafFiles

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782137#comment-16782137 ] Sean Owen commented on SPARK-26570: --- How big can these be? are you saying they're large, or that they

[jira] [Resolved] (SPARK-26587) Deadlock between SparkUI thread and Driver thread

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26587. --- Resolution: Duplicate > Deadlock between SparkUI thread and Driver thread >

[jira] [Resolved] (SPARK-26602) Once creating and querying udf with incorrect path, followed by querying tables or functions registered with correct path gives the runtime exception within the same sess

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26602. --- Resolution: Duplicate > Once creating and querying udf with incorrect path, followed by querying

[jira] [Updated] (SPARK-26770) Misleading/unhelpful error message when wrapping a null in an Option

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26770: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) I'm not sure; it's a user

[jira] [Resolved] (SPARK-26623) Need a transpose function

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26623. --- Resolution: Won't Fix Yeah, I've never heard of this type of function. The use cases I can think of

[jira] [Resolved] (SPARK-26624) Different classloader use on subsequent call to same query, causing different behavior

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26624. --- Resolution: Duplicate > Different classloader use on subsequent call to same query, causing

[jira] [Resolved] (SPARK-26693) Large Numbers Truncated

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26693. --- Resolution: Cannot Reproduce Assuming the zeppelin interpretation is correct for now > Large

[jira] [Resolved] (SPARK-26701) spark thrift server driver memory leak

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26701. --- Resolution: Duplicate > spark thrift server driver memory leak >

[jira] [Updated] (SPARK-26736) if filter condition has rand() function it does not do partition pruning

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26736: -- Priority: Minor (was: Major) > if filter condition has rand() function it does not do partition

[jira] [Resolved] (SPARK-26791) Some scala codes doesn't show friendly and some description about foreachBatch is misleading

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26791. --- Resolution: Not A Problem I read the docs and am not clear what the issue is. The scala code is

[jira] [Commented] (SPARK-26883) Spark MLIB Logistic Regression with heavy class imbalance estimates 0 coefficients

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782099#comment-16782099 ] Sean Owen commented on SPARK-26883: --- My guess: something goes wrong when a partition has 0 of the

[jira] [Updated] (SPARK-26807) Confusing documentation regarding installation from PyPi

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26807: -- Priority: Trivial (was: Minor) I'll just fix it, to expedite. > Confusing documentation regarding

[jira] [Resolved] (SPARK-26815) run command "Spark-shell --proxy-user " failed in kerberos environment

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26815. --- Resolution: Cannot Reproduce > run command "Spark-shell --proxy-user " failed in kerberos

[jira] [Resolved] (SPARK-26828) Coalesce to reduce partitions before writing to hive is not working

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26828. --- Resolution: Cannot Reproduce Not enough info here. > Coalesce to reduce partitions before writing

[jira] [Resolved] (SPARK-26943) Weird behaviour with `.cache()`

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26943. --- Resolution: Cannot Reproduce I don't think this is a bug, or at least, I can think of other reasons

[jira] [Resolved] (SPARK-26829) In place standard scaler so the column remains same after transformation

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26829. --- Resolution: Won't Fix > In place standard scaler so the column remains same after transformation >

[jira] [Commented] (SPARK-26863) Add minimal values for spark.driver.memory and spark.executor.memory

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782106#comment-16782106 ] Sean Owen commented on SPARK-26863: --- Sure but is it really a comment about the default, or limits on

[jira] [Commented] (SPARK-26867) Spark Support of YARN Placement Constraint

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782105#comment-16782105 ] Sean Owen commented on SPARK-26867: --- How would Spark use it? > Spark Support of YARN Placement

[jira] [Resolved] (SPARK-26876) Spark repl scala test failure on big-endian system

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26876. --- Resolution: Duplicate Though not the exact same issue, this is basically the same question, and

[jira] [Updated] (SPARK-26885) Remove yyyy/yyyy-[d]d format in DateTimeUtils for stringToTimestamp and stringToDate

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26885: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > Remove yyyy/yyyy-[d]d

[jira] [Commented] (SPARK-26896) Add maven profiles for running tests with JDK 11

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782096#comment-16782096 ] Sean Owen commented on SPARK-26896: --- Reproducing my comments from elsewhere: I don't think we want to

[jira] [Resolved] (SPARK-26899) CountMinSketchAgg ExpressionDescription is not so correct

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26899. --- Resolution: Not A Problem This isn't "Major", and I don't think it's a doc problem in a comment,

[jira] [Commented] (SPARK-26906) Pyspark RDD Replication Potentially Not Working

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782093#comment-16782093 ] Sean Owen commented on SPARK-26906: --- I can't reproduce this on 2.4.0. It shows "2x replicated". What

[jira] [Commented] (SPARK-26944) Python unit-tests.log not available in artifacts for a build in Jenkins

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782085#comment-16782085 ] Sean Owen commented on SPARK-26944: --- I can see things like

[jira] [Resolved] (SPARK-26984) Incompatibility between Spark releases - Some(null)

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26984. --- Resolution: Not A Problem I agree, I don't think "Some(null)" is reasonable. You mean None, right?

[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782076#comment-16782076 ] Sean Owen commented on SPARK-26947: --- How big is k? yes, you're going to run out of memory eventually

[jira] [Resolved] (SPARK-26951) Should not throw KryoException when root cause is IOexception

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26951. --- Resolution: Not A Problem Agree, I don't see a reason to retry in this case > Should not throw

[jira] [Updated] (SPARK-26970) Can't load PipelineModel that was created in Scala with Python due to missing Interaction transformer

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26970: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > Can't load PipelineModel

[jira] [Resolved] (SPARK-26972) Issue with CSV import and inferSchema set to true

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26972. --- Resolution: Not A Problem The option is "multiLine" > Issue with CSV import and inferSchema set to

[jira] [Commented] (SPARK-26980) Kryo deserialization not working with KryoSerializable class

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782072#comment-16782072 ] Sean Owen commented on SPARK-26980: --- This sounds like a Kryo usage question. It could have something

[jira] [Updated] (SPARK-26982) Enhance describe framework to describe the output of a query

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26982: -- Priority: Minor (was: Major) > Enhance describe framework to describe the output of a query >

[jira] [Updated] (SPARK-26983) Spark PassThroughSuite,ColumnVectorSuite failure on bigendian

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26983: -- Target Version/s: (was: 2.3.2) Priority: Minor (was: Major) Fix Version/s:

[jira] [Resolved] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26985. --- Resolution: Not A Problem Same as SPARK-26940; until it's reproducible on a standard JDK, I don't

[jira] [Commented] (SPARK-26991) Investigate difference of `returnNullable` between ScalaReflection.deserializerFor and JavaTypeInference.deserializerFor

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782062#comment-16782062 ] Sean Owen commented on SPARK-26991: --- I'm not sure what the outcome would be here; let's open JIRAs for

[jira] [Updated] (SPARK-27011) reset command fails after cache table

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27011: -- Priority: Minor (was: Critical) > reset command fails after cache table >

[jira] [Commented] (SPARK-27024) Design executor interface to support GPU resources

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782060#comment-16782060 ] Sean Owen commented on SPARK-27024: --- Is this different from SPARK-27005? > Design executor interface

[jira] [Commented] (SPARK-27015) spark-submit does not properly escape arguments sent to Mesos dispatcher

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782059#comment-16782059 ] Sean Owen commented on SPARK-27015: --- Sure, open a PR to escape the args as needed. > spark-submit

[jira] [Updated] (SPARK-27014) Support removal of jars and Spark binaries from Mesos driver and executor sandboxes

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27014: -- Priority: Minor (was: Major) > Support removal of jars and Spark binaries from Mesos driver and

[jira] [Commented] (SPARK-27025) Speed up toLocalIterator

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782051#comment-16782051 ] Sean Owen commented on SPARK-27025: --- If you fetched it all at once proactively, you have another

[jira] [Updated] (SPARK-26326) Cannot save a NaiveBayesModel with 48685 features and 5453 labels

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26326: -- Priority: Minor (was: Major) Yeah, this means you have a model with about 265M parameters, and when

[jira] [Updated] (SPARK-26373) Spark UI 'environment' tab - column to indicate default vs overridden values

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26373: -- Priority: Minor (was: Major) > Spark UI 'environment' tab - column to indicate default vs overridden

[jira] [Resolved] (SPARK-26347) MergeAggregate serialize and deserialize funcition can use ByteBuffer to opimize

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26347. --- Resolution: Won't Fix > MergeAggregate serialize and deserialize funcition can use ByteBuffer to >

[jira] [Commented] (SPARK-26357) Expose executors' procfs metrics to Metrics system

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782039#comment-16782039 ] Sean Owen commented on SPARK-26357: --- What is the use case for this? > Expose executors' procfs

[jira] [Resolved] (SPARK-26358) Spark deployed mode question

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26358. --- Resolution: Invalid There isn't enough detail here. It can be reopened if you can, ideally, provide

[jira] [Resolved] (SPARK-26967) Put MetricsSystem instance names together for clearer management

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26967. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23869

[jira] [Assigned] (SPARK-26967) Put MetricsSystem instance names together for clearer management

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26967: - Assignee: SongYadong > Put MetricsSystem instance names together for clearer management >

[jira] [Commented] (SPARK-26839) on JDK11, IsolatedClientLoader must be able to load java.sql classes

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781903#comment-16781903 ] Sean Owen commented on SPARK-26839: --- Just sharing where I have gotten to with this as well: I think we

[jira] [Resolved] (SPARK-27003) [Spark]Message display at console is not correct for spark.executor.instances

2019-02-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-27003. --- Resolution: Not A Problem > [Spark]Message display at console is not correct for

[jira] [Resolved] (SPARK-19465) Support custom Boolean values in CSV

2019-02-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19465. --- Resolution: Won't Fix > Support custom Boolean values in CSV >

[jira] [Commented] (SPARK-26839) on JDK11, IsolatedClientLoader must be able to load java.sql classes

2019-02-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780085#comment-16780085 ] Sean Owen commented on SPARK-26839: --- Sure. It may be that we need to access the platform classloader
