[jira] [Resolved] (SPARK-26807) Confusing documentation regarding installation from PyPi

2019-03-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26807. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23933

[jira] [Assigned] (SPARK-26807) Confusing documentation regarding installation from PyPi

2019-03-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-26807: Assignee: Sean Owen > Confusing documentation regarding installation from PyPi >

[jira] [Comment Edited] (SPARK-26867) Spark Support of YARN Placement Constraint

2019-03-01 Thread Prabhu Joseph (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782270#comment-16782270 ] Prabhu Joseph edited comment on SPARK-26867 at 3/2/19 3:52 AM: --- [~srowen]

[jira] [Commented] (SPARK-26867) Spark Support of YARN Placement Constraint

2019-03-01 Thread Prabhu Joseph (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782270#comment-16782270 ] Prabhu Joseph commented on SPARK-26867: --- Spark can allow users to configure the Placement

[jira] [Assigned] (SPARK-26982) Enhance describe framework to describe the output of a query

2019-03-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-26982: --- Assignee: Dilip Biswal > Enhance describe framework to describe the output of a query >

[jira] [Resolved] (SPARK-26982) Enhance describe framework to describe the output of a query

2019-03-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-26982. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23883

[jira] [Resolved] (SPARK-26492) support streaming DecisionTreeRegressor

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26492. --- Resolution: Won't Fix That's no extra info, and not a valid reason to reopen this. > support

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-01 Thread Martin Loncaric (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782258#comment-16782258 ] Martin Loncaric commented on SPARK-26555: - I can also replicate with different schemas

[jira] [Closed] (SPARK-26492) support streaming DecisionTreeRegressor

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen closed SPARK-26492. - > support streaming DecisionTreeRegressor > --- > >

[jira] [Commented] (SPARK-26492) support streaming DecisionTreeRegressor

2019-03-01 Thread sky54521 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782251#comment-16782251 ] sky54521 commented on SPARK-26492: -- If can not be implemented, then the spark streaming is weak >

[jira] [Comment Edited] (SPARK-26492) support streaming DecisionTreeRegressor

2019-03-01 Thread sky54521 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782250#comment-16782250 ] sky54521 edited comment on SPARK-26492 at 3/2/19 1:54 AM: -- Concrete implement

[jira] [Reopened] (SPARK-26492) support streaming DecisionTreeRegressor

2019-03-01 Thread sky54521 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sky54521 reopened SPARK-26492: -- Concrete implement way I don't know, but I think that can be implement > support streaming

[jira] [Comment Edited] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-01 Thread Martin Loncaric (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782239#comment-16782239 ] Martin Loncaric edited comment on SPARK-26555 at 3/2/19 1:25 AM: - I was

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-01 Thread Martin Loncaric (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782239#comment-16782239 ] Martin Loncaric commented on SPARK-26555: - I was able to replicate with both all rows as

[jira] [Commented] (SPARK-24130) Data Source V2: Join Push Down

2019-03-01 Thread William Wong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782237#comment-16782237 ] William Wong commented on SPARK-24130: -- https://github.com/apache/spark/pull/22547 It seems that

[jira] [Created] (SPARK-27029) Update Thrift to 0.12.0

2019-03-01 Thread Sean Owen (JIRA)
Sean Owen created SPARK-27029: - Summary: Update Thrift to 0.12.0 Key: SPARK-27029 URL: https://issues.apache.org/jira/browse/SPARK-27029 Project: Spark Issue Type: Task Components:

[jira] [Created] (SPARK-27028) PySpark read .dat file. Multiline issue

2019-03-01 Thread alokchowdary (JIRA)
alokchowdary created SPARK-27028: Summary: PySpark read .dat file. Multiline issue Key: SPARK-27028 URL: https://issues.apache.org/jira/browse/SPARK-27028 Project: Spark Issue Type: Question

[jira] [Created] (SPARK-27027) from_avro function does not deserialize the Avro record of a struct column type correctly

2019-03-01 Thread Hien Luu (JIRA)
Hien Luu created SPARK-27027: Summary: from_avro function does not deserialize the Avro record of a struct column type correctly Key: SPARK-27027 URL: https://issues.apache.org/jira/browse/SPARK-27027

[jira] [Assigned] (SPARK-26977) Warn against subclassing scala.App doesn't work

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26977: - Assignee: Manu Zhang > Warn against subclassing scala.App doesn't work >

[jira] [Resolved] (SPARK-26977) Warn against subclassing scala.App doesn't work

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26977. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23903

[jira] [Resolved] (SPARK-26387) Parallelism seems to cause difference in CrossValidation model metrics

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26387. --- Resolution: Not A Problem It shouldn't have any effect. But, you might get different results on

[jira] [Resolved] (SPARK-26458) OneHotEncoderModel verifies the number of category values incorrectly when tries to transform a dataframe.

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26458. --- Resolution: Not A Problem I don't quite get this; it already accounts for handleInvalid and

[jira] [Resolved] (SPARK-26395) Spark Thrift server memory leak

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26395. --- Resolution: Duplicate If this isn't resolved for you in 2.3.3 we can reopen, but the duplicate

[jira] [Resolved] (SPARK-26404) set spark.pyspark.python or PYSPARK_PYTHON doesn't work in k8s client-cluster mode.

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26404. --- Resolution: Not A Problem > set spark.pyspark.python or PYSPARK_PYTHON doesn't work in k8s

[jira] [Resolved] (SPARK-26407) For an external non-partitioned table, if add a directory named with k=v to the table path, select result will be wrong

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26407. --- Resolution: Not A Problem I don't think it's reasonable to add arbitrary other dirs under this

[jira] [Commented] (SPARK-26425) Add more constraint checks in file streaming source to avoid checkpoint corruption

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782196#comment-16782196 ] Sean Owen commented on SPARK-26425: --- [~kabhwan] I think you're welcome to work on this. > Add more

[jira] [Resolved] (SPARK-26408) java.util.NoSuchElementException: None.get at scala.None$.get(Option.scala:347)

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26408. --- Resolution: Not A Problem > java.util.NoSuchElementException: None.get at >

[jira] [Resolved] (SPARK-26492) support streaming DecisionTreeRegressor

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26492. --- Resolution: Won't Fix I'm not even sure that can be implemented in a one-pass algorithm? You'd have

[jira] [Commented] (SPARK-26494) 【spark sql】Use spark to read oracle TIMESTAMP(6) WITH LOCAL TIME ZONE type can't be found,

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782193#comment-16782193 ] Sean Owen commented on SPARK-26494: --- Correct me if I'm wrong, but does a timestamp without local time

[jira] [Commented] (SPARK-26505) Catalog class Function is missing "database" field

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782190#comment-16782190 ] Sean Owen commented on SPARK-26505: --- Go ahead with a PR > Catalog class Function is missing

[jira] [Resolved] (SPARK-26506) RegressionMetrics fails in Spark 2.4

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26506. --- Resolution: Cannot Reproduce There's not enough info here. What's the error? These work in general

[jira] [Resolved] (SPARK-26512) Spark 2.4.0 is not working with Hadoop 2.8.3 in windows 10

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26512. --- Resolution: Cannot Reproduce > Spark 2.4.0 is not working with Hadoop 2.8.3 in windows 10 >

[jira] [Commented] (SPARK-26518) UI Application Info Race Condition Can Throw NoSuchElement

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782186#comment-16782186 ] Sean Owen commented on SPARK-26518: --- It looks easy to handle the case where applicationInfo() isn't

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-01 Thread Martin Loncaric (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782185#comment-16782185 ] Martin Loncaric commented on SPARK-26555: - Will try it out and report back > Thread safety

[jira] [Resolved] (SPARK-26523) Getting this error while reading from kinesis :- Could not read until the end sequence number of the range: SequenceNumberRange

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26523. --- Resolution: Not A Problem This would have to have more info or be narrowed down more to make this

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782179#comment-16782179 ] Sean Owen commented on SPARK-26555: --- Hm, I might have that backwards; might be when all are not None?

[jira] [Commented] (SPARK-27024) Design executor interface to support GPU resources

2019-03-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782167#comment-16782167 ] Thomas Graves commented on SPARK-27024: --- This and SPARK-27005 basically split the design of the

[jira] [Comment Edited] (SPARK-26727) CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException

2019-03-01 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782162#comment-16782162 ] Gabor Somogyi edited comment on SPARK-26727 at 3/1/19 10:21 PM: I've

[jira] [Commented] (SPARK-26727) CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException

2019-03-01 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782162#comment-16782162 ] Gabor Somogyi commented on SPARK-26727: --- I've seen this issue lately when I was dealing with a

[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782146#comment-16782146 ] Sean Owen commented on SPARK-26555: --- This doesn't look like a Spark bug. It comes up, I think, when

[jira] [Updated] (SPARK-26589) proper `median` method for spark dataframe

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26589: -- Priority: Minor (was: Major) Would you like to implement it? It's kind of DIY here. It's not crazy

[jira] [Updated] (SPARK-26534) Closure Cleaner Bug

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26534: -- Priority: Minor (was: Major) Yes, the closure cleaner has never been able to be 100% sure it gets

[jira] [Resolved] (SPARK-26568) Too many partitions may cause thriftServer frequently Full GC

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26568. --- Resolution: Not A Problem I don't think this is actionable; you're generally saying that the Hive

[jira] [Commented] (SPARK-26570) Out of memory when InMemoryFileIndex bulkListLeafFiles

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782137#comment-16782137 ] Sean Owen commented on SPARK-26570: --- How big can these be? Are you saying they're large, or that they

[jira] [Commented] (SPARK-26998) spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode

2019-03-01 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782135#comment-16782135 ] Gabor Somogyi commented on SPARK-26998: --- How is this different from

[jira] [Resolved] (SPARK-26587) Deadlock between SparkUI thread and Driver thread

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26587. --- Resolution: Duplicate > Deadlock between SparkUI thread and Driver thread >

[jira] [Resolved] (SPARK-26602) Once creating and quering udf with incorrect path,followed by querying tables or functions registered with correct path gives the runtime exception within the same sess

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26602. --- Resolution: Duplicate > Once creating and quering udf with incorrect path,followed by querying

[jira] [Comment Edited] (SPARK-26991) Investigate difference of `returnNullable` between ScalaReflection.deserializerFor and JavaTypeInference.deserializerFor

2019-03-01 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782128#comment-16782128 ] Jungtaek Lim edited comment on SPARK-26991 at 3/1/19 9:53 PM: -- The outcome

[jira] [Updated] (SPARK-26770) Misleading/unhelpful error message when wrapping a null in an Option

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26770: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) I'm not sure; it's a user

[jira] [Resolved] (SPARK-26623) Need a transpose function

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26623. --- Resolution: Won't Fix Yeah, I've never heard of this type of function. The use cases I can think of

[jira] [Commented] (SPARK-26991) Investigate difference of `returnNullable` between ScalaReflection.deserializerFor and JavaTypeInference.deserializerFor

2019-03-01 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782128#comment-16782128 ] Jungtaek Lim commented on SPARK-26991: -- The outcome would be * "invalid" if there's reasonable

[jira] [Resolved] (SPARK-26624) Different classloader use on subsequent call to same query, causing different behavior

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26624. --- Resolution: Duplicate > Different classloader use on subsequent call to same query, causing

[jira] [Resolved] (SPARK-26693) Large Numbers Truncated

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26693. --- Resolution: Cannot Reproduce Assuming the zeppelin interpretation is correct for now > Large

[jira] [Resolved] (SPARK-26701) spark thrift server driver memory leak

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26701. --- Resolution: Duplicate > spark thrift server driver memory leak >

[jira] [Updated] (SPARK-26736) if filter condition has rand() function it does not do partition prunning

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26736: -- Priority: Minor (was: Major) > if filter condition has rand() function it does not do partition

[jira] [Resolved] (SPARK-26791) Some scala codes doesn't show friendly and some description about foreachBatch is misleading

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26791. --- Resolution: Not A Problem I read the docs and am not clear what the issue is. The scala code is

[jira] [Commented] (SPARK-26883) Spark MLIB Logistic Regression with heavy class imbalance estimates 0 coefficients

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782099#comment-16782099 ] Sean Owen commented on SPARK-26883: --- My guess: something goes wrong when a partition has 0 of the

[jira] [Assigned] (SPARK-26807) Confusing documentation regarding installation from PyPi

2019-03-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26807: Assignee: Apache Spark > Confusing documentation regarding installation from PyPi >

[jira] [Assigned] (SPARK-26807) Confusing documentation regarding installation from PyPi

2019-03-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26807: Assignee: (was: Apache Spark) > Confusing documentation regarding installation from

[jira] [Updated] (SPARK-26807) Confusing documentation regarding installation from PyPi

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26807: -- Priority: Trivial (was: Minor) I'll just fix it, to expedite. > Confusing documentation regarding

[jira] [Resolved] (SPARK-26815) run command "Spark-shell --proxy-user " failed in kerberos environment

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26815. --- Resolution: Cannot Reproduce > run command "Spark-shell --proxy-user " failed in kerberos

[jira] [Resolved] (SPARK-26828) Coalesce to reduce partitions before writing to hive is not working

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26828. --- Resolution: Cannot Reproduce Not enough info here. > Coalesce to reduce partitions before writing

[jira] [Resolved] (SPARK-26943) Weird behaviour with `.cache()`

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26943. --- Resolution: Cannot Reproduce I don't think this is a bug, or at least, I can think of other reasons

[jira] [Resolved] (SPARK-26829) In place standard scaler so the column remains same after transformation

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26829. --- Resolution: Won't Fix > In place standard scaler so the column remains same after transformation >

[jira] [Commented] (SPARK-26863) Add minimal values for spark.driver.memory and spark.executor.memory

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782106#comment-16782106 ] Sean Owen commented on SPARK-26863: --- Sure but is it really a comment about the default, or limits on

[jira] [Commented] (SPARK-26867) Spark Support of YARN Placement Constraint

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782105#comment-16782105 ] Sean Owen commented on SPARK-26867: --- How would Spark use it? > Spark Support of YARN Placement

[jira] [Resolved] (SPARK-26876) Spark repl scala test failure on big-endian system

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26876. --- Resolution: Duplicate Though not the exact same issue, this is basically the same question, and

[jira] [Updated] (SPARK-26885) Remove yyyy/yyyy-[d]d format in DataTimeUtils for stringToTimestamp and stringToDate

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26885: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > Remove yyyy/yyyy-[d]d

[jira] [Commented] (SPARK-26896) Add maven profiles for running tests with JDK 11

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782096#comment-16782096 ] Sean Owen commented on SPARK-26896: --- Reproducing my comments from elsewhere: I don't think we want to

[jira] [Resolved] (SPARK-26899) CountMinSketchAgg ExpressionDescription is not so correct

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26899. --- Resolution: Not A Problem This isn't "Major", and I don't think it's a doc problem in a comment,

[jira] [Commented] (SPARK-26906) Pyspark RDD Replication Potentially Not Working

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782093#comment-16782093 ] Sean Owen commented on SPARK-26906: --- I can't reproduce this on 2.4.0. It shows "2x replicated". What

[jira] [Commented] (SPARK-26944) Python unit-tests.log not available in artifacts for a build in Jenkins

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782085#comment-16782085 ] Sean Owen commented on SPARK-26944: --- I can see things like

[jira] [Commented] (SPARK-26980) Kryo deserialization not working with KryoSerializable class

2019-03-01 Thread Alexis Sarda-Espinosa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782082#comment-16782082 ] Alexis Sarda-Espinosa commented on SPARK-26980: --- Kryo works fine if used directly

[jira] [Assigned] (SPARK-27026) Upgrade Docker image for release build to Ubuntu 18.04

2019-03-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27026: Assignee: (was: Apache Spark) > Upgrade Docker image for release build to Ubuntu

[jira] [Resolved] (SPARK-26984) Incompatibility between Spark releases - Some(null)

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26984. --- Resolution: Not A Problem I agree, I don't think "Some(null)" is reasonable. You mean None, right?

[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782076#comment-16782076 ] Sean Owen commented on SPARK-26947: --- How big is k? Yes, you're going to run out of memory eventually

[jira] [Resolved] (SPARK-26951) Should not throw KryoException when root cause is IOexception

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26951. --- Resolution: Not A Problem Agree, I don't see a reason to retry in this case > Should not throw

[jira] [Updated] (SPARK-26970) Can't load PipelineModel that was created in Scala with Python due to missing Interaction transformer

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26970: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > Can't load PipelineModel

[jira] [Resolved] (SPARK-26972) Issue with CSV import and inferSchema set to true

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26972. --- Resolution: Not A Problem The option is "multiLine" > Issue with CSV import and inferSchema set to

[jira] [Assigned] (SPARK-27026) Upgrade Docker image for release build to Ubuntu 18.04

2019-03-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27026: Assignee: Apache Spark > Upgrade Docker image for release build to Ubuntu 18.04 >

[jira] [Commented] (SPARK-26980) Kryo deserialization not working with KryoSerializable class

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782072#comment-16782072 ] Sean Owen commented on SPARK-26980: --- This sounds like a Kryo usage question. It could have something

[jira] [Updated] (SPARK-26982) Enhance describe framework to describe the output of a query

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26982: -- Priority: Minor (was: Major) > Enhance describe framework to describe the output of a query >

[jira] [Updated] (SPARK-26983) Spark PassThroughSuite,ColumnVectorSuite failure on bigendian

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26983: -- Target Version/s: (was: 2.3.2) Priority: Minor (was: Major) Fix Version/s:

[jira] [Resolved] (SPARK-26985) Test "access only some column of the all of columns " fails on big endian

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26985. --- Resolution: Not A Problem Same as SPARK-26940; until it's reproducible on a standard JDK, I don't

[jira] [Commented] (SPARK-26991) Investigate difference of `returnNullable` between ScalaReflection.deserializerFor and JavaTypeInference.deserializerFor

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782062#comment-16782062 ] Sean Owen commented on SPARK-26991: --- I'm not sure what the outcome would be here; let's open JIRAs for

[jira] [Created] (SPARK-27026) Upgrade Docker image for release build to Ubuntu 18.04

2019-03-01 Thread DB Tsai (JIRA)
DB Tsai created SPARK-27026: --- Summary: Upgrade Docker image for release build to Ubuntu 18.04 Key: SPARK-27026 URL: https://issues.apache.org/jira/browse/SPARK-27026 Project: Spark Issue Type:

[jira] [Updated] (SPARK-27011) reset command fails after cache table

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27011: -- Priority: Minor (was: Critical) > reset command fails after cache table >

[jira] [Commented] (SPARK-27024) Design executor interface to support GPU resources

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782060#comment-16782060 ] Sean Owen commented on SPARK-27024: --- Is this different from SPARK-27005? > Design executor interface

[jira] [Commented] (SPARK-27015) spark-submit does not properly escape arguments sent to Mesos dispatcher

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782059#comment-16782059 ] Sean Owen commented on SPARK-27015: --- Sure, open a PR to escape the args as needed. > spark-submit

[jira] [Updated] (SPARK-27014) Support removal of jars and Spark binaries from Mesos driver and executor sandboxes

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-27014: -- Priority: Minor (was: Major) > Support removal of jars and Spark binaries from Mesos driver and

[jira] [Commented] (SPARK-27025) Speed up toLocalIterator

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782051#comment-16782051 ] Sean Owen commented on SPARK-27025: --- If you fetched it all at once proactively, you have another

[jira] [Updated] (SPARK-26326) Cannot save a NaiveBayesModel with 48685 features and 5453 labels

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26326: -- Priority: Minor (was: Major) Yeah, this means you have a model with about 265M parameters, and when

[jira] [Commented] (SPARK-26357) Expose executors' procfs metrics to Metrics system

2019-03-01 Thread Reza Safi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782046#comment-16782046 ] Reza Safi commented on SPARK-26357: --- I think the primary usecase is for those users who are depending

[jira] [Updated] (SPARK-26373) Spark UI 'environment' tab - column to indicate default vs overridden values

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26373: -- Priority: Minor (was: Major) > Spark UI 'environment' tab - column to indicate default vs overridden

[jira] [Resolved] (SPARK-26347) MergeAggregate serialize and deserialize funcition can use ByteBuffer to opimize

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26347. --- Resolution: Won't Fix > MergeAggregate serialize and deserialize funcition can use ByteBuffer to >

[jira] [Commented] (SPARK-26357) Expose executors' procfs metrics to Metrics system

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782039#comment-16782039 ] Sean Owen commented on SPARK-26357: --- What is the use case for this? > Expose executors' procfs

[jira] [Resolved] (SPARK-26358) Spark deployed mode question

2019-03-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26358. --- Resolution: Invalid There isn't enough detail here. It can be reopened if you can, ideally, provide

[jira] [Resolved] (SPARK-26997) k8s integration tests failing after client upgraded to 4.1.2

2019-03-01 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26997. Resolution: Fixed Target Version/s: 3.0.0 I reverted the client upgrade and

[jira] [Assigned] (SPARK-26048) Flume connector for Spark 2.4 does not exist in Maven repository

2019-03-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26048: Assignee: (was: Apache Spark) > Flume connector for Spark 2.4 does not exist in

[jira] [Assigned] (SPARK-26048) Flume connector for Spark 2.4 does not exist in Maven repository

2019-03-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26048: Assignee: Apache Spark > Flume connector for Spark 2.4 does not exist in Maven
