[jira] [Issue Comment Deleted] (SPARK-22946) Recursive withColumn calls cause org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB

2018-01-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-22946: Comment: was deleted (was: I am unable to reproduce on master. If I remember correctly, this

[jira] [Comment Edited] (SPARK-22946) Recursive withColumn calls cause org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB

2018-01-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320402#comment-16320402 ] Marco Gaido edited comment on SPARK-22946 at 1/10/18 3:13 PM: -- I am unable

[jira] [Created] (SPARK-23080) Improve error message for built-in functions

2018-01-15 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23080: --- Summary: Improve error message for built-in functions Key: SPARK-23080 URL: https://issues.apache.org/jira/browse/SPARK-23080 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326378#comment-16326378 ] Marco Gaido commented on SPARK-23078: - [~ozzieba] I see that in Kubernetes it might work, but I think

[jira] [Commented] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326337#comment-16326337 ] Marco Gaido commented on SPARK-23078: - The problem is: in cluster mode you don't control where the

[jira] [Comment Edited] (SPARK-22923) Non-equi join(theta join) should use sort merge join

2018-01-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325074#comment-16325074 ] Marco Gaido edited comment on SPARK-22923 at 1/16/18 10:47 AM: --- I don't

[jira] [Commented] (SPARK-23156) Code of method "initialize(I)V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB

2018-01-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332434#comment-16332434 ] Marco Gaido commented on SPARK-23156: - [~kzawisto] a lot of work on this has been done and it is both

[jira] [Updated] (SPARK-23087) CheckCartesianProduct too restrictive when condition is constant folded to false/null

2018-01-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23087: Priority: Minor (was: Major) > CheckCartesianProduct too restrictive when condition is constant

[jira] [Resolved] (SPARK-23225) Spark is infering decimal values with wrong precision

2018-01-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23225. - Resolution: Duplicate > Spark is infering decimal values with wrong precision >

[jira] [Commented] (SPARK-23225) Spark is infering decimal values with wrong precision

2018-01-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340813#comment-16340813 ] Marco Gaido commented on SPARK-23225: - I am not able to reproduce on master. May you provide a sample

[jira] [Created] (SPARK-23234) ML python test failure

2018-01-26 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23234: --- Summary: ML python test failure Key: SPARK-23234 URL: https://issues.apache.org/jira/browse/SPARK-23234 Project: Spark Issue Type: Bug Components:

[jira] [Updated] (SPARK-23234) ML python test failure

2018-01-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23234: Description: SPARK-22799 and SPARK-22797 are causing valid Python test failures. The reason is

[jira] [Updated] (SPARK-23234) ML python test failure due to default outputCol

2018-01-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23234: Summary: ML python test failure due to default outputCol (was: ML python test failure) > ML

[jira] [Updated] (SPARK-23234) ML python test failure due to default outputCol

2018-01-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23234: Description: SPARK-22799 and SPARK-22797 are causing valid Python test failures. The reason is

[jira] [Commented] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)

2018-01-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16330269#comment-16330269 ] Marco Gaido commented on SPARK-23130: - [~seano] there is no JIRA for the pipeout issue and there

[jira] [Resolved] (SPARK-23212) Casts the column to a different data type.

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23212. - Resolution: Invalid This is not the right place. For questions, please use the user mailing

[jira] [Updated] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23217: Attachment: SPARK-23217.pdf > Add cosine distance measure to ClusteringEvaluator >

[jira] [Updated] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23217: Attachment: (was: SPARK-23217.pages) > Add cosine distance measure to ClusteringEvaluator >

[jira] [Updated] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23217: Attachment: SPARK-23217.pages > Add cosine distance measure to ClusteringEvaluator >

[jira] [Created] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23217: --- Summary: Add cosine distance measure to ClusteringEvaluator Key: SPARK-23217 URL: https://issues.apache.org/jira/browse/SPARK-23217 Project: Spark Issue Type:

[jira] [Commented] (SPARK-22923) Non-equi join(theta join) should use sort merge join

2018-01-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325074#comment-16325074 ] Marco Gaido commented on SPARK-22923: - I don't think SortMergeJoinExec can be used, since the

[jira] [Created] (SPARK-23055) KafkaContinuousSourceSuite Kafka column types test failing

2018-01-12 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23055: --- Summary: KafkaContinuousSourceSuite Kafka column types test failing Key: SPARK-23055 URL: https://issues.apache.org/jira/browse/SPARK-23055 Project: Spark

[jira] [Commented] (SPARK-23273) Spark Dataset withColumn - schema column order isn't the same as case class paramether order

2018-01-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346758#comment-16346758 ] Marco Gaido commented on SPARK-23273: - [~viirya] I don't think that this would solve this problem.

[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348622#comment-16348622 ] Marco Gaido commented on SPARK-22575: - I think STS is the only Spark application where this can

[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348548#comment-16348548 ] Marco Gaido commented on SPARK-22575: - I am not able to reproduce the issue. May I ask you to provide

[jira] [Comment Edited] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348548#comment-16348548 ] Marco Gaido edited comment on SPARK-22575 at 2/1/18 1:06 PM: - I am not able

[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348603#comment-16348603 ] Marco Gaido commented on SPARK-22575: - Then the problem is likely that the executors are killed in

[jira] [Resolved] (SPARK-22692) Reduce the number of generated mutable states

2018-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-22692. - Resolution: Fixed > Reduce the number of generated mutable states >

[jira] [Commented] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357473#comment-16357473 ] Marco Gaido commented on SPARK-23244: - The change is related because your problem is caused by the

[jira] [Commented] (SPARK-23338) Spark unable to run on HDP deployed Azure Blob File System

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356771#comment-16356771 ] Marco Gaido commented on SPARK-23338: - [~Subham] questions should be sent to the user mailing list,

[jira] [Commented] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356786#comment-16356786 ] Marco Gaido commented on SPARK-23244: - maybe we can close this as a duplicate of SPARK-23234. Anyway,

[jira] [Created] (SPARK-23344) Add KMeans distanceMeasure param to PySpark

2018-02-06 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23344: --- Summary: Add KMeans distanceMeasure param to PySpark Key: SPARK-23344 URL: https://issues.apache.org/jira/browse/SPARK-23344 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23393) Path is error when run test in local machine

2018-02-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360471#comment-16360471 ] Marco Gaido commented on SPARK-23393: - I think this is a problem for your environment. The path is

[jira] [Commented] (SPARK-23394) Storage info's Cached Partitions doesn't consider the replications (but sc.getRDDStorageInfo does)

2018-02-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360593#comment-16360593 ] Marco Gaido commented on SPARK-23394: - I think this is not an issue. `numCachedPartitions` is 20

[jira] [Created] (SPARK-23412) Add cosine distance measure to BisectingKMeans

2018-02-13 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23412: --- Summary: Add cosine distance measure to BisectingKMeans Key: SPARK-23412 URL: https://issues.apache.org/jira/browse/SPARK-23412 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23344) Add KMeans distanceMeasure param to PySpark

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362758#comment-16362758 ] Marco Gaido commented on SPARK-23344: - [~srowen] I did it this way because I always say doing so. Not

[jira] [Commented] (SPARK-23344) Add KMeans distanceMeasure param to PySpark

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362774#comment-16362774 ] Marco Gaido commented on SPARK-23344: - I see. It would be good indeed to decide in the community a

[jira] [Commented] (SPARK-23411) Deprecate SparkContext.getExecutorStorageStatus

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362597#comment-16362597 ] Marco Gaido commented on SPARK-23411: - I think this method was removed in SPARK-20659. So I think

[jira] [Commented] (SPARK-23416) flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362920#comment-16362920 ] Marco Gaido commented on SPARK-23416: - I see this failing also with this stacktrace: {code:java}

[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363741#comment-16363741 ] Marco Gaido commented on SPARK-23402: - Yes the table existed. please try with the current master. I

[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363657#comment-16363657 ] Marco Gaido commented on SPARK-23402: - I tried with Postgres 10, driver 42.2.1 and I was unable to

[jira] [Commented] (SPARK-23420) Datasource loading not handling paths with regex chars.

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363634#comment-16363634 ] Marco Gaido commented on SPARK-23420: - I don't remember the ticket number but this may be solved. May

[jira] [Commented] (SPARK-23234) ML python test failure due to default outputCol

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364680#comment-16364680 ] Marco Gaido commented on SPARK-23234: - [~josephkb] maybe it is not a blocker, but since this can

[jira] [Commented] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358315#comment-16358315 ] Marco Gaido commented on SPARK-23373: - I cannot reproduce on current master... May you try and check

[jira] [Commented] (SPARK-22105) Dataframe has poor performance when computing on many columns with codegen

2018-02-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359420#comment-16359420 ] Marco Gaido commented on SPARK-22105: - [~WeichenXu123] which is the number of rows for the dataset

[jira] [Commented] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358402#comment-16358402 ] Marco Gaido commented on SPARK-23373: - Then I think we can close this, thanks. > Can not execute

[jira] [Resolved] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23373. - Resolution: Cannot Reproduce > Can not execute "count distinct" queries on parquet formatted

[jira] [Updated] (SPARK-23375) Optimizer should remove unneeded Sort

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23375: Description: As pointed out in SPARK-23368, as of now there is no rule to remove the Sort

[jira] [Created] (SPARK-23375) Optimizer should remove unneeded Sort

2018-02-09 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23375: --- Summary: Optimizer should remove unneeded Sort Key: SPARK-23375 URL: https://issues.apache.org/jira/browse/SPARK-23375 Project: Spark Issue Type: Improvement

[jira] [Comment Edited] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356786#comment-16356786 ] Marco Gaido edited comment on SPARK-23244 at 2/8/18 10:47 AM: -- maybe we can

[jira] [Commented] (SPARK-23041) Inconsistent `drop`ing of columns in dataframes

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356850#comment-16356850 ] Marco Gaido commented on SPARK-23041: - yes I am unable to reproduce this problem in master branch. >

[jira] [Commented] (SPARK-23439) Ambiguous reference when selecting column inside StructType with same name that outer colum

2018-02-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366945#comment-16366945 ] Marco Gaido commented on SPARK-23439: - [~cloud_fan] I think this comes from

[jira] [Commented] (SPARK-23399) Register a task completion listener first for OrcColumnarBatchReader

2018-02-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366788#comment-16366788 ] Marco Gaido commented on SPARK-23399: - I think we should reopen this, it is still happening:

[jira] [Commented] (SPARK-23442) Reading from partitioned and bucketed table uses only bucketSpec.numBuckets partitions in all cases

2018-02-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366898#comment-16366898 ] Marco Gaido commented on SPARK-23442: - I am not sure it is what you are looking for, but you can

[jira] [Commented] (SPARK-23436) Incorrect Date column Inference in partition discovery

2018-02-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365780#comment-16365780 ] Marco Gaido commented on SPARK-23436: - Thanks for reporting this. This affects also current branch. I

[jira] [Created] (SPARK-23458) OrcSuite flaky test

2018-02-17 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23458: --- Summary: OrcSuite flaky test Key: SPARK-23458 URL: https://issues.apache.org/jira/browse/SPARK-23458 Project: Spark Issue Type: Task Components: SQL

[jira] [Commented] (SPARK-23458) OrcSuite flaky test

2018-02-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368295#comment-16368295 ] Marco Gaido commented on SPARK-23458: - cc [~dongjoon] > OrcSuite flaky test > --- >

[jira] [Updated] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23473: Component/s: (was: Spark Core) SQL > spark.catalog.listTables error when

[jira] [Commented] (SPARK-23475) The "stages" page doesn't show any completed stages

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371347#comment-16371347 ] Marco Gaido commented on SPARK-23475: - The reason of this behavior is that SKIPPED stages, which were

[jira] [Commented] (SPARK-23463) Filter operation fails to handle blank values and evicts rows that even satisfy the filtering condition

2018-02-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368968#comment-16368968 ] Marco Gaido commented on SPARK-23463: - sorry, what do you mean by blank values? Which is the type of

[jira] [Commented] (SPARK-23463) Filter operation fails to handle blank values and evicts rows that even satisfy the filtering condition

2018-02-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370046#comment-16370046 ] Marco Gaido commented on SPARK-23463: - Hi [~m.bakshi11]. The problem is very easy. The column `val`

[jira] [Created] (SPARK-23451) Deprecate KMeans computeCost

2018-02-16 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23451: --- Summary: Deprecate KMeans computeCost Key: SPARK-23451 URL: https://issues.apache.org/jira/browse/SPARK-23451 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-23496) Locality of coalesced partitions can be severely skewed by the order of input partitions

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374530#comment-16374530 ] Marco Gaido commented on SPARK-23496: - [~ala.luszczak] thanks for your answer. Honestly I don't see

[jira] [Created] (SPARK-23501) Refactor AllStagesPage in order to avoid redundant code

2018-02-23 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23501: --- Summary: Refactor AllStagesPage in order to avoid redundant code Key: SPARK-23501 URL: https://issues.apache.org/jira/browse/SPARK-23501 Project: Spark Issue

[jira] [Commented] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371279#comment-16371279 ] Marco Gaido commented on SPARK-23473: - Your stack error points out which is the real issue: {code}

[jira] [Resolved] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23473. - Resolution: Invalid > spark.catalog.listTables error when database name starts with a number >

[jira] [Comment Edited] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371279#comment-16371279 ] Marco Gaido edited comment on SPARK-23473 at 2/21/18 11:53 AM: --- Your stack

[jira] [Commented] (SPARK-23463) Filter operation fails to handle blank values and evicts rows that even satisfy the filtering condition

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371186#comment-16371186 ] Marco Gaido commented on SPARK-23463: - It changed Spark's implicit casting. Probably in 2.1.1

[jira] [Commented] (SPARK-23477) Misleading exception message when union fails due to metadata

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371278#comment-16371278 ] Marco Gaido commented on SPARK-23477: - [~kretes] yes. I think we can close this, do you agree? >

[jira] [Commented] (SPARK-23477) Misleading exception message when union fails due to metadata

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371238#comment-16371238 ] Marco Gaido commented on SPARK-23477: - I cannot reproduce this on master. > Misleading exception

[jira] [Commented] (SPARK-23493) insert-into depends on columns order, otherwise incorrect data inserted

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374258#comment-16374258 ] Marco Gaido commented on SPARK-23493: - I don't think so. Partition columns are always at the end. If

[jira] [Commented] (SPARK-23493) insert-into depends on columns order, otherwise incorrect data inserted

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374146#comment-16374146 ] Marco Gaido commented on SPARK-23493: - I don't think this is an issue. I think this is the expected

[jira] [Commented] (SPARK-23493) insert-into depends on columns order, otherwise incorrect data inserted

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374358#comment-16374358 ] Marco Gaido commented on SPARK-23493: - How can it know that you are not setting the partition column

[jira] [Created] (SPARK-23489) HiveExternalCatalogVersionsSuite flaky test

2018-02-22 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23489: --- Summary: HiveExternalCatalogVersionsSuite flaky test Key: SPARK-23489 URL: https://issues.apache.org/jira/browse/SPARK-23489 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-23496) Locality of coalesced partitions can be severely skewed by the order of input partitions

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374439#comment-16374439 ] Marco Gaido commented on SPARK-23496: - I read that the proposed solution is to use random numbers

[jira] [Resolved] (SPARK-21828) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB...again

2017-12-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-21828. - Resolution: Duplicate >

[jira] [Created] (SPARK-22904) Basic tests for decimal operations and string cast

2017-12-26 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-22904: --- Summary: Basic tests for decimal operations and string cast Key: SPARK-22904 URL: https://issues.apache.org/jira/browse/SPARK-22904 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24606) Decimals multiplication and division may be null due to the result precision overflow

2018-06-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24606: Priority: Major (was: Blocker) > Decimals multiplication and division may be null due to the

[jira] [Commented] (SPARK-24606) Decimals multiplication and division may be null due to the result precision overflow

2018-06-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518130#comment-16518130 ] Marco Gaido commented on SPARK-24606: - Critical and Blocker are reserved for committers. Closing as

[jira] [Resolved] (SPARK-24606) Decimals multiplication and division may be null due to the result precision overflow

2018-06-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-24606. - Resolution: Duplicate > Decimals multiplication and division may be null due to the result

[jira] [Commented] (SPARK-24607) Distribute by rand() can lead to data inconsistency

2018-06-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518196#comment-16518196 ] Marco Gaido commented on SPARK-24607: - [~viirya] please check the description in the Hive ticket.

[jira] [Commented] (SPARK-24498) Add JDK compiler for runtime codegen

2018-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520125#comment-16520125 ] Marco Gaido commented on SPARK-24498: - Thanks for your great analysis [~maropu]! Very interesting.

[jira] [Commented] (SPARK-24598) SPARK SQL:Datatype overflow conditions gives incorrect result

2018-08-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568005#comment-16568005 ] Marco Gaido commented on SPARK-24598: - [~smilegator] as we just enhanced the doc, but we have not

[jira] [Commented] (SPARK-23937) High-order function: map_filter(map, function) → MAP

2018-08-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568152#comment-16568152 ] Marco Gaido commented on SPARK-23937: - I am working on this, thanks. > High-order function:

[jira] [Commented] (SPARK-24957) Decimal arithmetic can lead to wrong values using codegen

2018-07-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561077#comment-16561077 ] Marco Gaido commented on SPARK-24957: - I am not sure what you mean by "When codegen is disabled all

[jira] [Commented] (SPARK-24944) SparkUi build problem

2018-07-27 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559790#comment-16559790 ] Marco Gaido commented on SPARK-24944: - This seems more a problem in your project and your

[jira] [Created] (SPARK-24948) SHS filters wrongly some applications due to permission check

2018-07-27 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24948: --- Summary: SHS filters wrongly some applications due to permission check Key: SPARK-24948 URL: https://issues.apache.org/jira/browse/SPARK-24948 Project: Spark

[jira] [Commented] (SPARK-24975) Spark history server REST API /api/v1/version returns error 404

2018-07-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563259#comment-16563259 ] Marco Gaido commented on SPARK-24975: - This seems a duplicate of SPARK-24188. Despite here I see

[jira] [Commented] (SPARK-24928) spark sql cross join running time too long

2018-07-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558287#comment-16558287 ] Marco Gaido commented on SPARK-24928: - The affected version is pretty old, can you check a newer

[jira] [Commented] (SPARK-24944) SparkUi build problem

2018-07-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561587#comment-16561587 ] Marco Gaido commented on SPARK-24944: - Can you close this JIRA as invalid? Thanks. > SparkUi build

[jira] [Commented] (SPARK-25031) The schema of MapType can not be printed correctly

2018-08-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573299#comment-16573299 ] Marco Gaido commented on SPARK-25031: - [~smilegator] shall this be resolved as

[jira] [Created] (SPARK-25123) SimpleExprValue may cause the loss of a reference

2018-08-15 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25123: --- Summary: SimpleExprValue may cause the loss of a reference Key: SPARK-25123 URL: https://issues.apache.org/jira/browse/SPARK-25123 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23908) High-order function: transform(array, function) → array

2018-08-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581364#comment-16581364 ] Marco Gaido commented on SPARK-23908: - [~huaxingao] they are not exposed through the Scala API, so

[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb

2018-08-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577593#comment-16577593 ] Marco Gaido commented on SPARK-25094: - This is a duplicate of many. Unfortunately this problem has

[jira] [Commented] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579514#comment-16579514 ] Marco Gaido commented on SPARK-25051: - This was caused by the introduction of AnalysisBarrier. I

[jira] [Updated] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25051: Labels: correctness (was: ) > where clause on dataset gives AnalysisException >

[jira] [Commented] (SPARK-25051) where clause on dataset gives AnalysisException

2018-08-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579444#comment-16579444 ] Marco Gaido commented on SPARK-25051: - cc [~jerryshao] shall we set it as a blocker for 2.3.2? >

[jira] [Commented] (SPARK-25125) Spark SQL percentile_approx takes longer than Hive version for large datasets

2018-08-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582383#comment-16582383 ] Marco Gaido commented on SPARK-25125: - I think this may be a duplicate of SPARK-25125. [~myali] may

[jira] [Commented] (SPARK-25093) CodeFormatter could avoid creating regex object again and again

2018-08-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582543#comment-16582543 ] Marco Gaido commented on SPARK-25093: - [~igreenfi] do you want to submit a PR for this? Otherwise I

[jira] [Commented] (SPARK-25031) The schema of MapType can not be printed correctly

2018-08-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582531#comment-16582531 ] Marco Gaido commented on SPARK-25031: - ^ kindly ping [~smilegator] > The schema of MapType can not
