[jira] [Commented] (SPARK-25732) Allow specifying a keytab/principal for proxy user for token renewal

2018-10-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16649917#comment-16649917 ] Marco Gaido commented on SPARK-25732: - cc [~vanzin] [~tgraves] [~jerryshao] [~mridul

[jira] [Commented] (SPARK-25728) SPIP: Structured Intermediate Representation (Tungsten IR) for generating Java code

2018-10-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651480#comment-16651480 ] Marco Gaido commented on SPARK-25728: - Thanks [~kiszk]. I will check it ASAP, thanks

[jira] [Commented] (SPARK-25732) Allow specifying a keytab/principal for proxy user for token renewal

2018-10-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651800#comment-16651800 ] Marco Gaido commented on SPARK-25732: - [~tgraves] I think they can be reused, the po

[jira] [Commented] (SPARK-25732) Allow specifying a keytab/principal for proxy user for token renewal

2018-10-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651860#comment-16651860 ] Marco Gaido commented on SPARK-25732: - [~tgraves] yes, exactly it is what I am refer

[jira] [Created] (SPARK-25758) Deprecate BisectingKMeans compute cost

2018-10-17 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25758: --- Summary: Deprecate BisectingKMeans compute cost Key: SPARK-25758 URL: https://issues.apache.org/jira/browse/SPARK-25758 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-25758) Deprecate BisectingKMeans compute cost

2018-10-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653641#comment-16653641 ] Marco Gaido commented on SPARK-25758: - cc [~cloud_fan] [~srowen] [~holdenkarau]. Thi

[jira] [Created] (SPARK-25764) Avoid usage of deprecated methods in examples for BisectingKMeans

2018-10-18 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25764: --- Summary: Avoid usage of deprecated methods in examples for BisectingKMeans Key: SPARK-25764 URL: https://issues.apache.org/jira/browse/SPARK-25764 Project: Spark

[jira] [Created] (SPARK-25765) Add trainingCost to BisectingKMeans summary

2018-10-18 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25765: --- Summary: Add trainingCost to BisectingKMeans summary Key: SPARK-25765 URL: https://issues.apache.org/jira/browse/SPARK-25765 Project: Spark Issue Type: Improve

[jira] [Commented] (SPARK-25767) Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library

2018-10-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655366#comment-16655366 ] Marco Gaido commented on SPARK-25767: - I tried on current master branch but I wasn't

[jira] [Commented] (SPARK-25767) Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library

2018-10-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655418#comment-16655418 ] Marco Gaido commented on SPARK-25767: - It is interesting, I can reproduce with the J

[jira] [Commented] (SPARK-25767) Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library

2018-10-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655440#comment-16655440 ] Marco Gaido commented on SPARK-25767: - So I tracked down the issue. The problem is t

[jira] [Commented] (SPARK-25767) Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library

2018-10-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655530#comment-16655530 ] Marco Gaido commented on SPARK-25767: - Your conversion of a Java array in a Scala Se

[jira] [Commented] (SPARK-25767) Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library

2018-10-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656861#comment-16656861 ] Marco Gaido commented on SPARK-25767: - I think it is a bug (thanks for reporting thi

[jira] [Commented] (SPARK-25829) Duplicated map keys are not handled consistently

2018-10-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16663553#comment-16663553 ] Marco Gaido commented on SPARK-25829: - I think the main issue is that since this is

[jira] [Created] (SPARK-25838) Remove formatVersion from Saveable

2018-10-25 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25838: --- Summary: Remove formatVersion from Saveable Key: SPARK-25838 URL: https://issues.apache.org/jira/browse/SPARK-25838 Project: Spark Issue Type: Task C

[jira] [Commented] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala

2018-10-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1876#comment-1876 ] Marco Gaido commented on SPARK-25863: - [~Tagar] thanks for reporting this. May you p

[jira] [Updated] (SPARK-25866) Update KMeans formatVersion

2018-10-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25866: Priority: Minor (was: Major) > Update KMeans formatVersion > --- > >

[jira] [Updated] (SPARK-25866) Update KMeans formatVersion

2018-10-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25866: Issue Type: Bug (was: Task) > Update KMeans formatVersion > --- > >

[jira] [Created] (SPARK-25866) Update KMeans formatVersion

2018-10-29 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25866: --- Summary: Update KMeans formatVersion Key: SPARK-25866 URL: https://issues.apache.org/jira/browse/SPARK-25866 Project: Spark Issue Type: Task Componen

[jira] [Created] (SPARK-25867) Remove KMeans computeCost

2018-10-29 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25867: --- Summary: Remove KMeans computeCost Key: SPARK-25867 URL: https://issues.apache.org/jira/browse/SPARK-25867 Project: Spark Issue Type: Task Components

[jira] [Commented] (SPARK-25870) RandomSplit with seed gives different results depending on column order

2018-10-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667346#comment-16667346 ] Marco Gaido commented on SPARK-25870: - Why do you consider this a bug? They are 2 di

[jira] [Commented] (SPARK-25870) RandomSplit with seed gives different results depending on column order

2018-10-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668313#comment-16668313 ] Marco Gaido commented on SPARK-25870: - If you do some transformations (simple or com

[jira] [Commented] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala

2018-10-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668438#comment-16668438 ] Marco Gaido commented on SPARK-25863: - [~Tagar] thanks. ??not sure yet as it might

[jira] [Commented] (SPARK-25441) calculate term frequency in CountVectorizer()

2018-10-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668654#comment-16668654 ] Marco Gaido commented on SPARK-25441: - TF has an appropriate transformer. I think th

[jira] [Commented] (SPARK-25870) RandomSplit with seed gives different results depending on column order

2018-10-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669037#comment-16669037 ] Marco Gaido commented on SPARK-25870: - Thanks [~deacuna]. > RandomSplit with seed g

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-11-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16674827#comment-16674827 ] Marco Gaido commented on SPARK-24437: - Hi [~dvogelbacher], thanks for your comment an

[jira] [Commented] (SPARK-25650) Make analyzer rules used in once-policy idempotent

2018-11-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16674955#comment-16674955 ] Marco Gaido commented on SPARK-25650: - [~maryannxue] since all the subtasks are comp

[jira] [Commented] (SPARK-25650) Make analyzer rules used in once-policy idempotent

2018-11-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16674954#comment-16674954 ] Marco Gaido commented on SPARK-25650: - [~maryannxue] since all the subtasks are comp

[jira] [Issue Comment Deleted] (SPARK-25650) Make analyzer rules used in once-policy idempotent

2018-11-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25650: Comment: was deleted (was: [~maryannxue] since all the subtasks are completed, shall we close this

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-11-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675353#comment-16675353 ] Marco Gaido commented on SPARK-24437: - [~eyalfa] yes, that is the point, if there is

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-11-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16679745#comment-16679745 ] Marco Gaido commented on SPARK-24437: - [~dvogelbacher] the point is: a broadcast is

[jira] [Commented] (SPARK-23338) Spark unable to run on HDP deployed Azure Blob File System

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356771#comment-16356771 ] Marco Gaido commented on SPARK-23338: - [~Subham] questions should be sent to the user

[jira] [Commented] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356786#comment-16356786 ] Marco Gaido commented on SPARK-23244: - maybe we can close this as a duplicate of SPAR

[jira] [Comment Edited] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356786#comment-16356786 ] Marco Gaido edited comment on SPARK-23244 at 2/8/18 10:47 AM: -

[jira] [Commented] (SPARK-23041) Inconsistent `drop`ing of columns in dataframes

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356850#comment-16356850 ] Marco Gaido commented on SPARK-23041: - yes I am unable to reproduce this problem in m

[jira] [Commented] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357473#comment-16357473 ] Marco Gaido commented on SPARK-23244: - The change is related because your problem is

[jira] [Commented] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358315#comment-16358315 ] Marco Gaido commented on SPARK-23373: - I cannot reproduce on current master... May yo

[jira] [Commented] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358402#comment-16358402 ] Marco Gaido commented on SPARK-23373: - Then I think we can close this, thanks. > Can

[jira] [Resolved] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23373. - Resolution: Cannot Reproduce > Can not execute "count distinct" queries on parquet formatted tabl

[jira] [Created] (SPARK-23375) Optimizer should remove unneeded Sort

2018-02-09 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23375: --- Summary: Optimizer should remove unneeded Sort Key: SPARK-23375 URL: https://issues.apache.org/jira/browse/SPARK-23375 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-23375) Optimizer should remove unneeded Sort

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23375: Description: As pointed out in SPARK-23368, as of now there is no rule to remove the Sort operator

[jira] [Commented] (SPARK-22105) Dataframe has poor performance when computing on many columns with codegen

2018-02-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359420#comment-16359420 ] Marco Gaido commented on SPARK-22105: - [~WeichenXu123] which is the number of rows fo

[jira] [Commented] (SPARK-23393) Path is error when run test in local machine

2018-02-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360471#comment-16360471 ] Marco Gaido commented on SPARK-23393: - I think this is a problem for your environment

[jira] [Commented] (SPARK-23394) Storage info's Cached Partitions doesn't consider the replications (but sc.getRDDStorageInfo does)

2018-02-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360593#comment-16360593 ] Marco Gaido commented on SPARK-23394: - I think this is not an issue. `numCachedPartit

[jira] [Created] (SPARK-23412) Add cosine distance measure to BisectingKMeans

2018-02-13 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23412: --- Summary: Add cosine distance measure to BisectingKMeans Key: SPARK-23412 URL: https://issues.apache.org/jira/browse/SPARK-23412 Project: Spark Issue Type: Impr

[jira] [Commented] (SPARK-23411) Deprecate SparkContext.getExecutorStorageStatus

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362597#comment-16362597 ] Marco Gaido commented on SPARK-23411: - I think this method was removed in SPARK-20659

[jira] [Commented] (SPARK-23344) Add KMeans distanceMeasure param to PySpark

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362758#comment-16362758 ] Marco Gaido commented on SPARK-23344: - [~srowen] I did it this way because I always s

[jira] [Commented] (SPARK-23344) Add KMeans distanceMeasure param to PySpark

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362774#comment-16362774 ] Marco Gaido commented on SPARK-23344: - I see. It would be good indeed to decide in th

[jira] [Commented] (SPARK-23416) flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362920#comment-16362920 ] Marco Gaido commented on SPARK-23416: - I see this failing also with this stacktrace:

[jira] [Commented] (SPARK-23420) Datasource loading not handling paths with regex chars.

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363634#comment-16363634 ] Marco Gaido commented on SPARK-23420: - I don't remember the ticket number but this ma

[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363657#comment-16363657 ] Marco Gaido commented on SPARK-23402: - I tried with Postgres 10, driver 42.2.1 and I

[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363741#comment-16363741 ] Marco Gaido commented on SPARK-23402: - Yes the table existed. please try with the cur

[jira] [Commented] (SPARK-23234) ML python test failure due to default outputCol

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364680#comment-16364680 ] Marco Gaido commented on SPARK-23234: - [~josephkb] maybe it is not a blocker, but sin

[jira] [Commented] (SPARK-23436) Incorrect Date column Inference in partition discovery

2018-02-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365780#comment-16365780 ] Marco Gaido commented on SPARK-23436: - Thanks for reporting this. This affects also c

[jira] [Commented] (SPARK-23399) Register a task completion listener first for OrcColumnarBatchReader

2018-02-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366788#comment-16366788 ] Marco Gaido commented on SPARK-23399: - I think we should reopen this, it is still hap

[jira] [Commented] (SPARK-23442) Reading from partitioned and bucketed table uses only bucketSpec.numBuckets partitions in all cases

2018-02-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366898#comment-16366898 ] Marco Gaido commented on SPARK-23442: - I am not sure it is what you are looking for,

[jira] [Commented] (SPARK-23439) Ambiguous reference when selecting column inside StructType with same name that outer colum

2018-02-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366945#comment-16366945 ] Marco Gaido commented on SPARK-23439: - [~cloud_fan] I think this comes from https://g

[jira] [Created] (SPARK-23451) Deprecate KMeans computeCost

2018-02-16 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23451: --- Summary: Deprecate KMeans computeCost Key: SPARK-23451 URL: https://issues.apache.org/jira/browse/SPARK-23451 Project: Spark Issue Type: Task Compone

[jira] [Created] (SPARK-23458) OrcSuite flaky test

2018-02-17 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23458: --- Summary: OrcSuite flaky test Key: SPARK-23458 URL: https://issues.apache.org/jira/browse/SPARK-23458 Project: Spark Issue Type: Task Components: SQL

[jira] [Commented] (SPARK-23458) OrcSuite flaky test

2018-02-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368295#comment-16368295 ] Marco Gaido commented on SPARK-23458: - cc [~dongjoon] > OrcSuite flaky test > --

[jira] [Commented] (SPARK-23463) Filter operation fails to handle blank values and evicts rows that even satisfy the filtering condition

2018-02-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368968#comment-16368968 ] Marco Gaido commented on SPARK-23463: - sorry, what do you mean by blank values? Which

[jira] [Commented] (SPARK-23463) Filter operation fails to handle blank values and evicts rows that even satisfy the filtering condition

2018-02-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16370046#comment-16370046 ] Marco Gaido commented on SPARK-23463: - Hi [~m.bakshi11]. The problem is very easy. Th

[jira] [Commented] (SPARK-23463) Filter operation fails to handle blank values and evicts rows that even satisfy the filtering condition

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371186#comment-16371186 ] Marco Gaido commented on SPARK-23463: - It changed Spark's implicit casting. Probably

[jira] [Updated] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23473: Component/s: (was: Spark Core) SQL > spark.catalog.listTables error when datab

[jira] [Commented] (SPARK-23477) Misleading exception message when union fails due to metadata

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371238#comment-16371238 ] Marco Gaido commented on SPARK-23477: - I cannot reproduce this on master. > Misleadi

[jira] [Commented] (SPARK-23477) Misleading exception message when union fails due to metadata

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371278#comment-16371278 ] Marco Gaido commented on SPARK-23477: - [~kretes] yes. I think we can close this, do y

[jira] [Commented] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371279#comment-16371279 ] Marco Gaido commented on SPARK-23473: - Your stack error points out which is the real

[jira] [Resolved] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23473. - Resolution: Invalid > spark.catalog.listTables error when database name starts with a number > --

[jira] [Comment Edited] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371279#comment-16371279 ] Marco Gaido edited comment on SPARK-23473 at 2/21/18 11:53 AM:

[jira] [Commented] (SPARK-23475) The "stages" page doesn't show any completed stages

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371347#comment-16371347 ] Marco Gaido commented on SPARK-23475: - The reason for this behavior is that SKIPPED st

[jira] [Created] (SPARK-23489) HiveExternalCatalogVersionsSuite flaky test

2018-02-22 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23489: --- Summary: HiveExternalCatalogVersionsSuite flaky test Key: SPARK-23489 URL: https://issues.apache.org/jira/browse/SPARK-23489 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-23493) insert-into depends on columns order, otherwise incorrect data inserted

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374146#comment-16374146 ] Marco Gaido commented on SPARK-23493: - I don't think this is an issue. I think this i

[jira] [Commented] (SPARK-23493) insert-into depends on columns order, otherwise incorrect data inserted

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374258#comment-16374258 ] Marco Gaido commented on SPARK-23493: - I don't think so. Partition columns are always

[jira] [Commented] (SPARK-23493) insert-into depends on columns order, otherwise incorrect data inserted

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374358#comment-16374358 ] Marco Gaido commented on SPARK-23493: - How can it know that you are not setting the p

[jira] [Commented] (SPARK-23496) Locality of coalesced partitions can be severely skewed by the order of input partitions

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374439#comment-16374439 ] Marco Gaido commented on SPARK-23496: - I read that the proposed solution is to use ra

[jira] [Commented] (SPARK-23496) Locality of coalesced partitions can be severely skewed by the order of input partitions

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374530#comment-16374530 ] Marco Gaido commented on SPARK-23496: - [~ala.luszczak] thanks for your answer. Honest

[jira] [Created] (SPARK-23501) Refactor AllStagesPage in order to avoid redundant code

2018-02-23 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23501: --- Summary: Refactor AllStagesPage in order to avoid redundant code Key: SPARK-23501 URL: https://issues.apache.org/jira/browse/SPARK-23501 Project: Spark Issue T

[jira] [Commented] (SPARK-23531) When explain, plan's output should include attribute type info

2018-02-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379967#comment-16379967 ] Marco Gaido commented on SPARK-23531: - I am working on this. I will submit a PR soon.

[jira] [Commented] (SPARK-23535) MinMaxScaler return 0.5 for an all zero column

2018-02-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380334#comment-16380334 ] Marco Gaido commented on SPARK-23535: - I checked and each tool behaves in its own way

[jira] [Commented] (SPARK-23528) Expose vital statistics of GaussianMixtureModel

2018-02-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380426#comment-16380426 ] Marco Gaido commented on SPARK-23528: - The log likelihood is already available in the

[jira] [Commented] (SPARK-23498) Accuracy problem in comparison with string and integer

2018-03-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383373#comment-16383373 ] Marco Gaido commented on SPARK-23498: - I think we are seeing many of these issues wit

[jira] [Created] (SPARK-23568) Silhouette should get number of features from metadata if available

2018-03-02 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23568: --- Summary: Silhouette should get number of features from metadata if available Key: SPARK-23568 URL: https://issues.apache.org/jira/browse/SPARK-23568 Project: Spark

[jira] [Commented] (SPARK-23598) WholeStageCodegen can lead to IllegalAccessError calling append for HashAggregateExec

2018-03-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16386234#comment-16386234 ] Marco Gaido commented on SPARK-23598: - thanks for reporting this. Actually the one wh

[jira] [Commented] (SPARK-23590) Add interpreted execution for CreateExternalRow expression

2018-03-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16387721#comment-16387721 ] Marco Gaido commented on SPARK-23590: - I am working on this > Add interpreted execut

[jira] [Commented] (SPARK-23592) Add interpreted execution for DecodeUsingSerializer expression

2018-03-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388075#comment-16388075 ] Marco Gaido commented on SPARK-23592: - I will submit a PR as soon as SPARK-23591 gets

[jira] [Created] (SPARK-23628) WholeStageCodegen can generate methods with too many params

2018-03-08 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23628: --- Summary: WholeStageCodegen can generate methods with too many params Key: SPARK-23628 URL: https://issues.apache.org/jira/browse/SPARK-23628 Project: Spark Is

[jira] [Commented] (SPARK-23598) WholeStageCodegen can lead to IllegalAccessError calling append for HashAggregateExec

2018-03-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391373#comment-16391373 ] Marco Gaido commented on SPARK-23598: - [~dvogelbacher] the parameter you are talking

[jira] [Created] (SPARK-23644) SHS with proxy doesn't show applications

2018-03-10 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23644: --- Summary: SHS with proxy doesn't show applications Key: SPARK-23644 URL: https://issues.apache.org/jira/browse/SPARK-23644 Project: Spark Issue Type: Improvemen

[jira] [Commented] (SPARK-23739) Spark structured streaming long running problem

2018-03-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16407112#comment-16407112 ] Marco Gaido commented on SPARK-23739: - Can you provide some more info about how you a

[jira] [Commented] (SPARK-23739) Spark structured streaming long running problem

2018-03-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411377#comment-16411377 ] Marco Gaido commented on SPARK-23739: - [~zsxwing] [~joseph.torres] [~c...@koeninger.o

[jira] [Created] (SPARK-23782) SHS should not show applications to user without read permission

2018-03-23 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23782: --- Summary: SHS should not show applications to user without read permission Key: SPARK-23782 URL: https://issues.apache.org/jira/browse/SPARK-23782 Project: Spark

[jira] [Commented] (SPARK-23782) SHS should not show applications to user without read permission

2018-03-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411836#comment-16411836 ] Marco Gaido commented on SPARK-23782: - [~vanzin] sorry but I have not been able to fi

[jira] [Commented] (SPARK-23782) SHS should not show applications to user without read permission

2018-03-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412510#comment-16412510 ] Marco Gaido commented on SPARK-23782: - [~vanzin] thanks for the link. I see that in t

[jira] [Commented] (SPARK-23782) SHS should not show applications to user without read permission

2018-03-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16417149#comment-16417149 ] Marco Gaido commented on SPARK-23782: - [~vanzin] sorry but I cannot see any usability

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-03-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420335#comment-16420335 ] Marco Gaido commented on SPARK-23791: - Hi [~rednikotin]. Thanks for reporting this. T

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422064#comment-16422064 ] Marco Gaido commented on SPARK-23791: - Thanks, [~rednikotin]. The error you noticed i

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422068#comment-16422068 ] Marco Gaido commented on SPARK-23791: - Yes, I think you're right [~maropu]. Do you wa

[jira] [Commented] (SPARK-23835) When Dataset.as converts column from nullable to non-nullable type, null Doubles are converted silently to -1

2018-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422082#comment-16422082 ] Marco Gaido commented on SPARK-23835: - Actually this is not the first time we see thi

[jira] [Commented] (SPARK-23902) Provide an option in months_between UDF to disable rounding-off

2018-04-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430214#comment-16430214 ] Marco Gaido commented on SPARK-23902: - I will work on this, thanks. > Provide an opt

[jira] [Commented] (SPARK-23916) High-order function: array_join(x, delimiter, null_replacement) → varchar

2018-04-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430230#comment-16430230 ] Marco Gaido commented on SPARK-23916: - I will work on this, thanks. > High-order fun
