[jira] [Commented] (SPARK-22420) Spark SQL return invalid json string for struct with date/datetime field

2017-11-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237368#comment-16237368 ] Marco Gaido commented on SPARK-22420: - I think this is related and will be resolved by SPARK-20202 >

[jira] [Commented] (SPARK-22418) Add test cases for NULL Handling

2017-11-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237442#comment-16237442 ] Marco Gaido commented on SPARK-22418: - can I work on this? > Add test cases for NULL Handling >

[jira] [Created] (SPARK-22413) Type coercion for IN is not coherent between Literals and subquery

2017-11-01 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-22413: --- Summary: Type coercion for IN is not coherent between Literals and subquery Key: SPARK-22413 URL: https://issues.apache.org/jira/browse/SPARK-22413 Project: Spark

[jira] [Commented] (SPARK-22440) Add Calinski-Harabasz index to ClusteringEvaluator

2017-11-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238095#comment-16238095 ] Marco Gaido commented on SPARK-22440: - I am preparing an implementation for this. It will still take

[jira] [Created] (SPARK-22440) Add Calinski-Harabasz index to ClusteringEvaluator

2017-11-03 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-22440: --- Summary: Add Calinski-Harabasz index to ClusteringEvaluator Key: SPARK-22440 URL: https://issues.apache.org/jira/browse/SPARK-22440 Project: Spark Issue Type:

[jira] [Commented] (SPARK-22440) Add Calinski-Harabasz index to ClusteringEvaluator

2017-11-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238241#comment-16238241 ] Marco Gaido commented on SPARK-22440: - Honestly I don't know what people are using for clustering

[jira] [Commented] (SPARK-21725) spark thriftserver insert overwrite table partition select

2017-11-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233858#comment-16233858 ] Marco Gaido commented on SPARK-21725: - [~zhangxin0112zx] Can you share the spark-thriftserver logs?

[jira] [Commented] (SPARK-21725) spark thriftserver insert overwrite table partition select

2017-11-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234105#comment-16234105 ] Marco Gaido commented on SPARK-21725: - I tried using a mysql metastore and the target package, on a

[jira] [Commented] (SPARK-22371) dag-scheduler-event-loop thread stopped with error Attempted to access garbage collected accumulator 5605982

2017-11-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234112#comment-16234112 ] Marco Gaido commented on SPARK-22371: - Could you please provide an easy way to reproduce the issue?

[jira] [Commented] (SPARK-22398) Partition directories with leading 0s cause wrong results

2017-11-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234192#comment-16234192 ] Marco Gaido commented on SPARK-22398: - [~viirya] sorry for the unrequested ping, I saw that you

[jira] [Resolved] (SPARK-22460) Spark De-serialization of Timestamp field is Incorrect

2017-11-07 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-22460. - Resolution: Invalid > Spark De-serialization of Timestamp field is Incorrect >

[jira] [Commented] (SPARK-22460) Spark De-serialization of Timestamp field is Incorrect

2017-11-07 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241806#comment-16241806 ] Marco Gaido commented on SPARK-22460: - this is not a Spark issue, but this is a problem of the

[jira] [Commented] (SPARK-22460) Spark De-serialization of Timestamp field is Incorrect

2017-11-07 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242135#comment-16242135 ] Marco Gaido commented on SPARK-22460: - [~saniyat...@gmail.com] I did the same things using both

[jira] [Commented] (SPARK-22478) Spark - Truncate date by Day / Hour

2017-11-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16245748#comment-16245748 ] Marco Gaido commented on SPARK-22478: - The reason why {{TRUNC}} works like this is for compatibility

[jira] [Created] (SPARK-22473) Replace deprecated AsyncAssertions.Waiter and methods of java.sql.Date

2017-11-08 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-22473: --- Summary: Replace deprecated AsyncAssertions.Waiter and methods of java.sql.Date Key: SPARK-22473 URL: https://issues.apache.org/jira/browse/SPARK-22473 Project: Spark

[jira] [Comment Edited] (SPARK-22472) Datasets generate random values for null primitive types

2017-11-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16244310#comment-16244310 ] Marco Gaido edited comment on SPARK-22472 at 11/8/17 4:58 PM: -- Two things:

[jira] [Commented] (SPARK-22472) Datasets generate random values for null primitive types

2017-11-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16244310#comment-16244310 ] Marco Gaido commented on SPARK-22472: - Two things: 1 - if you use `as[Option[Long]]`, it works fine;

[jira] [Commented] (SPARK-21725) spark thriftserver insert overwrite table partition select

2017-11-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234332#comment-16234332 ] Marco Gaido commented on SPARK-21725: - I don't have any idea about which is the difference. Please

[jira] [Commented] (SPARK-22398) Partition directories with leading 0s cause wrong results

2017-11-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235708#comment-16235708 ] Marco Gaido commented on SPARK-22398: - [~hyukjin.kwon] I think that here there are two points: 1)

[jira] [Commented] (SPARK-22398) Partition directories with leading 0s cause wrong results

2017-11-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235710#comment-16235710 ] Marco Gaido commented on SPARK-22398: - [~viirya] I see your point. Thanks for your answer. >

[jira] [Resolved] (SPARK-21725) spark thriftserver insert overwrite table partition select

2017-11-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-21725. - Resolution: Not A Bug > spark thriftserver insert overwrite table partition select >

[jira] [Commented] (SPARK-19759) ALSModel.predict on Dataframes : potential optimization by not using blas

2017-11-07 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242314#comment-16242314 ] Marco Gaido commented on SPARK-19759: - I tried comparing the current implementation with an easy for

[jira] [Commented] (SPARK-20299) NullPointerException when null and string are in a tuple while encoding Dataset

2017-12-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274586#comment-16274586 ] Marco Gaido commented on SPARK-20299: - I found the problems. I think some checks are missing in

[jira] [Comment Edited] (SPARK-20299) NullPointerException when null and string are in a tuple while encoding Dataset

2017-12-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274586#comment-16274586 ] Marco Gaido edited comment on SPARK-20299 at 12/3/17 9:21 AM: -- I found the

[jira] [Commented] (SPARK-22751) Improve ML RandomForest shuffle performance

2017-12-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16285729#comment-16285729 ] Marco Gaido commented on SPARK-22751: - You can submit a PR on github, if you have a working solution

[jira] [Created] (SPARK-22752) FileNotFoundException while reading from Kafka

2017-12-11 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-22752: --- Summary: FileNotFoundException while reading from Kafka Key: SPARK-22752 URL: https://issues.apache.org/jira/browse/SPARK-22752 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-22698) Avoid the generation of useless mutable states by GenerateUnsafeProjection

2017-12-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-22698. - Resolution: Invalid I am closing since I have not been able to solve the issue without

[jira] [Commented] (SPARK-22692) Reduce the number of generated mutable states

2017-12-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278458#comment-16278458 ] Marco Gaido commented on SPARK-22692: - I felt this was the best way to go in order to split the

[jira] [Resolved] (SPARK-22694) Avoid the generation of useless mutable states by regexp functions

2017-12-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-22694. - Resolution: Invalid > Avoid the generation of useless mutable states by regexp functions >

[jira] [Resolved] (SPARK-22684) Avoid the generation of useless mutable states by datetime functions

2017-12-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-22684. - Resolution: Invalid > Avoid the generation of useless mutable states by datetime functions >

[jira] [Created] (SPARK-22715) Reuse array in CreateNamedStruct

2017-12-06 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-22715: --- Summary: Reuse array in CreateNamedStruct Key: SPARK-22715 URL: https://issues.apache.org/jira/browse/SPARK-22715 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-22750) Introduce reusable mutable states

2017-12-10 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-22750: --- Summary: Introduce reusable mutable states Key: SPARK-22750 URL: https://issues.apache.org/jira/browse/SPARK-22750 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-22752) FileNotFoundException while reading from Kafka

2017-12-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287305#comment-16287305 ] Marco Gaido commented on SPARK-22752: - Hi [~zsxwing], thanks for looking at this. The checkpointDir

[jira] [Commented] (SPARK-22761) 64KB JVM bytecode limit problem with GLM

2017-12-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287394#comment-16287394 ] Marco Gaido commented on SPARK-22761: - this is probably solved by SPARK-6. Can you try to

[jira] [Commented] (SPARK-22478) Spark - Truncate date by Day / Hour

2017-12-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16275979#comment-16275979 ] Marco Gaido commented on SPARK-22478: - [~Davidhod] if other people agree that we should do that, I

[jira] [Commented] (SPARK-22629) incorrect handling of calls to random in UDFs

2017-12-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284166#comment-16284166 ] Marco Gaido commented on SPARK-22629: - This problem is related to SPARK-20586. I created a PR to add

[jira] [Resolved] (SPARK-22697) Avoid the generation of useless mutable states by GenerateMutableProjection

2017-12-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-22697. - Resolution: Invalid > Avoid the generation of useless mutable states by

[jira] [Commented] (SPARK-21725) spark thriftserver insert overwrite table partition select

2017-10-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220228#comment-16220228 ] Marco Gaido commented on SPARK-21725: - please try with the master branch, not with Spark 2.1.2. I

[jira] [Commented] (SPARK-21725) spark thriftserver insert overwrite table partition select

2017-10-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226720#comment-16226720 ] Marco Gaido commented on SPARK-21725: - [~zhangxin0112zx] I am sorry but I am still unable to

[jira] [Commented] (SPARK-22398) Partition directories with leading 0s cause wrong results

2017-10-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226991#comment-16226991 ] Marco Gaido commented on SPARK-22398: - you just need to set

[jira] [Commented] (SPARK-24189) Spark Strcutured Streaming not working with the Kafka Transactions

2018-05-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464700#comment-16464700 ] Marco Gaido commented on SPARK-24189: - All Kafka options should be set with the prefix {{kafka.}}.

[jira] [Commented] (SPARK-24088) only HadoopRDD leverage HDFS Cache as preferred location

2018-05-04 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463653#comment-16463653 ] Marco Gaido commented on SPARK-24088: - [~xiaojuwu] I don't understand which problem is stated here.

[jira] [Commented] (SPARK-24177) Spark returning inconsistent rows and data in a join query when run using Spark SQL (using SQLContext.sql(...))

2018-05-04 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463616#comment-16463616 ] Marco Gaido commented on SPARK-24177: - [~ajay_monga] please may you try with a higher spark version?

[jira] [Created] (SPARK-24268) DataType in error messages are not coherent

2018-05-14 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24268: --- Summary: DataType in error messages are not coherent Key: SPARK-24268 URL: https://issues.apache.org/jira/browse/SPARK-24268 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24089) DataFrame.write().mode(SaveMode.Append).insertInto(TABLE)

2018-04-27 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456230#comment-16456230 ] Marco Gaido commented on SPARK-24089: - [~rkrgarlapati] the problem is that you are not inserting to

[jira] [Comment Edited] (SPARK-24089) DataFrame.write().mode(SaveMode.Append).insertInto(TABLE)

2018-04-27 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456230#comment-16456230 ] Marco Gaido edited comment on SPARK-24089 at 4/27/18 11:01 AM: ---

[jira] [Created] (SPARK-24209) 0 configuration Knox gateway support in SHS

2018-05-08 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24209: --- Summary: 0 configuration Knox gateway support in SHS Key: SPARK-24209 URL: https://issues.apache.org/jira/browse/SPARK-24209 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24189) Spark Strcutured Streaming not working with the Kafka Transactions

2018-05-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464788#comment-16464788 ] Marco Gaido commented on SPARK-24189: - [~abharath9] how are you submitting this application? Are you

[jira] [Commented] (SPARK-24189) Spark Strcutured Streaming not working with the Kafka Transactions

2018-05-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464798#comment-16464798 ] Marco Gaido commented on SPARK-24189: - You need to add also: {code} org.apache.kafka kafka-clients

[jira] [Updated] (SPARK-23778) SparkContext.emptyRDD confuses SparkContext.union

2018-05-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23778: Priority: Trivial (was: Minor) > SparkContext.emptyRDD confuses SparkContext.union >

[jira] [Commented] (SPARK-24298) PCAModel Memory in Pipeline

2018-05-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478802#comment-16478802 ] Marco Gaido commented on SPARK-24298: - May you please provide a small program/simple list of steps to

[jira] [Updated] (SPARK-24313) array_contains/array_position interpreted execution doesn't work with complex types

2018-05-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24313: Summary: array_contains/array_position interpreted execution doesn't work with complex types

[jira] [Updated] (SPARK-24313) array_contains/array_position interpreted execution doesn't work with complex types

2018-05-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24313: Description: The functions {{array_contains}} and {{array_position}} return incorrect result for

[jira] [Created] (SPARK-24313) array_contains interpreted execution doesn't work with complex types

2018-05-18 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24313: --- Summary: array_contains interpreted execution doesn't work with complex types Key: SPARK-24313 URL: https://issues.apache.org/jira/browse/SPARK-24313 Project: Spark

[jira] [Created] (SPARK-24315) Multiple streaming jobs detected error causing job failure

2018-05-18 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24315: --- Summary: Multiple streaming jobs detected error causing job failure Key: SPARK-24315 URL: https://issues.apache.org/jira/browse/SPARK-24315 Project: Spark

[jira] [Updated] (SPARK-24313) Collection functions interpreted execution doesn't work with complex types

2018-05-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24313: Summary: Collection functions interpreted execution doesn't work with complex types (was:

[jira] [Updated] (SPARK-24313) Collection functions interpreted execution doesn't work with complex types

2018-05-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24313: Description: Several functions working on collection return incorrect result for complex data

[jira] [Commented] (SPARK-24260) Support for multi-statement SQL in SparkSession.sql API

2018-05-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475678#comment-16475678 ] Marco Gaido commented on SPARK-24260: - I don't think it is a good idea. How can you return the result

[jira] [Updated] (SPARK-24344) Spark SQL Thrift Server issue

2018-05-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24344: Priority: Major (was: Blocker) > Spark SQL Thrift Server issue > - >

[jira] [Commented] (SPARK-24344) Spark SQL Thrift Server issue

2018-05-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483829#comment-16483829 ] Marco Gaido commented on SPARK-24344: - I moved to Major as Critical and Blocker are reserved for

[jira] [Commented] (SPARK-24341) Codegen compile error from predicate subquery

2018-05-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483846#comment-16483846 ] Marco Gaido commented on SPARK-24341: - This is an issue in the Optimizer, rather than a codegen

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data

2018-05-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490551#comment-16490551 ] Marco Gaido commented on SPARK-24373: - [~wbzhao] yes, I do agree with you. That is the problem. >

[jira] [Commented] (SPARK-24389) describe() can't work on column that name contain dots

2018-05-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490797#comment-16490797 ] Marco Gaido commented on SPARK-24389: - I cannot reproduce on current master. Probably it has been

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data

2018-05-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491124#comment-16491124 ] Marco Gaido commented on SPARK-24373: - [~smilegator] I think an eager API is not related to the

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data when the analyzed plans are different after re-analyzing the plans

2018-05-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491271#comment-16491271 ] Marco Gaido commented on SPARK-24373: - [~smilegator] yes, you're right, the impact would be

[jira] [Resolved] (SPARK-24315) Multiple streaming jobs detected error causing job failure

2018-05-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-24315. - Resolution: Not A Bug > Multiple streaming jobs detected error causing job failure >

[jira] [Created] (SPARK-24531) HiveExternalCatalogVersionsSuite failing due to missing 2.2.0 version

2018-06-12 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24531: --- Summary: HiveExternalCatalogVersionsSuite failing due to missing 2.2.0 version Key: SPARK-24531 URL: https://issues.apache.org/jira/browse/SPARK-24531 Project: Spark

[jira] [Commented] (SPARK-24481) GeneratedIteratorForCodegenStage1 grows beyond 64 KB

2018-06-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506183#comment-16506183 ] Marco Gaido commented on SPARK-24481: - Yes, because before SPARK-22520 code generation was disabled

[jira] [Commented] (SPARK-24481) GeneratedIteratorForCodegenStage1 grows beyond 64 KB

2018-06-07 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504828#comment-16504828 ] Marco Gaido commented on SPARK-24481: - Thanks for reporting this. I am investigating more, but I

[jira] [Commented] (SPARK-23901) Data Masking Functions

2018-06-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514687#comment-16514687 ] Marco Gaido commented on SPARK-23901: - These functions can be used as any other function in Hive,

[jira] [Commented] (SPARK-23931) High-order function: array_zip(array1, array2[, ...]) → array

2018-06-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510079#comment-16510079 ] Marco Gaido commented on SPARK-23931: - I just edited the title/description in order to update to the

[jira] [Updated] (SPARK-23931) High-order function: array_zip(array1, array2[, ...]) → array

2018-06-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23931: Description: Ref: https://prestodb.io/docs/current/functions/array.html Merges the given arrays,

[jira] [Updated] (SPARK-23931) High-order function: array_zip(array1, array2[, ...]) → array

2018-06-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23931: Summary: High-order function: array_zip(array1, array2[, ...]) → array (was: High-order

[jira] [Resolved] (SPARK-24481) GeneratedIteratorForCodegenStage1 grows beyond 64 KB

2018-06-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-24481. - Resolution: Not A Problem I am resolving this as no further action can be taken IMHO (other

[jira] [Comment Edited] (SPARK-24481) GeneratedIteratorForCodegenStage1 grows beyond 64 KB

2018-06-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509662#comment-16509662 ] Marco Gaido edited comment on SPARK-24481 at 6/12/18 2:03 PM: -- I am

[jira] [Created] (SPARK-24562) Allow running same tests with multiple configs in SQLQueryTestSuite

2018-06-14 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24562: --- Summary: Allow running same tests with multiple configs in SQLQueryTestSuite Key: SPARK-24562 URL: https://issues.apache.org/jira/browse/SPARK-24562 Project: Spark

[jira] [Commented] (SPARK-24510) Spark WebUI filters use Basic Authentication [security]

2018-06-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507976#comment-16507976 ] Marco Gaido commented on SPARK-24510: - I am not sure this is a real issue. You can configure many

[jira] [Commented] (SPARK-24509) Spark WebUI [security] - Web Server Version Disclosure

2018-06-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508032#comment-16508032 ] Marco Gaido commented on SPARK-24509: - I see the point, but Spark is open source, so anybody knows

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-05-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496331#comment-16496331 ] Marco Gaido commented on SPARK-24437: - I remember another JIRA about this. Anyway, this is indeed a

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data when the analyzed plans are different after re-analyzing the plans

2018-05-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493548#comment-16493548 ] Marco Gaido commented on SPARK-24373: - [~wbzhao] as I answered on the PR, the fix is complete and

[jira] [Commented] (SPARK-24395) Fix Behavior of NOT IN with Literals Containing NULL

2018-05-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495068#comment-16495068 ] Marco Gaido commented on SPARK-24395: - The main issue here is that {{(null, null) = (1, 2)}} in

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-05-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496910#comment-16496910 ] Marco Gaido commented on SPARK-24437: - Reproducing the issue is quite easy: you just need to run

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-06-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498072#comment-16498072 ] Marco Gaido commented on SPARK-24437: - [~tanejagagan] the map you are referring seems to be the

[jira] [Commented] (SPARK-24468) DecimalType `adjustPrecisionScale` might fail when scale is negative

2018-06-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501655#comment-16501655 ] Marco Gaido commented on SPARK-24468: - Thanks for reporting this. I will submit soon a fix. Thanks.

[jira] [Resolved] (SPARK-24389) describe() can't work on column that name contain dots

2018-05-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-24389. - Resolution: Cannot Reproduce > describe() can't work on column that name contain dots >

[jira] [Commented] (SPARK-24401) Aggreate on Decimal Types does not work

2018-05-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16492781#comment-16492781 ] Marco Gaido commented on SPARK-24401: - I followed the repro steps using the file you attached and I

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-06-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498201#comment-16498201 ] Marco Gaido commented on SPARK-24437: - I checked and it seems that the leakage you are reporting

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-05-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496585#comment-16496585 ] Marco Gaido commented on SPARK-24437: - I just remembered that I started working on this some time

[jira] [Commented] (SPARK-24437) Memory leak in UnsafeHashedRelation

2018-05-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496598#comment-16496598 ] Marco Gaido commented on SPARK-24437: - Do you have dynamic allocation enabled? > Memory leak in

[jira] [Commented] (SPARK-24712) TrainValidationSplit ignores label column name and forces to be "label"

2018-07-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529746#comment-16529746 ] Marco Gaido commented on SPARK-24712: - The problem is that you have not set the label on the

[jira] [Resolved] (SPARK-24712) TrainValidationSplit ignores label column name and forces to be "label"

2018-07-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-24712. - Resolution: Not A Problem > TrainValidationSplit ignores label column name and forces to be

[jira] [Created] (SPARK-24660) SHS is not showing properly errors when downloading logs

2018-06-26 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24660: --- Summary: SHS is not showing properly errors when downloading logs Key: SPARK-24660 URL: https://issues.apache.org/jira/browse/SPARK-24660 Project: Spark Issue

[jira] [Commented] (SPARK-24208) Cannot resolve column in self join after applying Pandas UDF

2018-06-27 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525210#comment-16525210 ] Marco Gaido commented on SPARK-24208: - I think this may be a duplicate of SPARK-24373. Can you try

[jira] [Commented] (SPARK-24719) ClusteringEvaluator supports integer type labels

2018-07-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530493#comment-16530493 ] Marco Gaido commented on SPARK-24719: - [~mengxr] I tried to pass integer values in the prediction

[jira] [Commented] (SPARK-24719) ClusteringEvaluator supports integer type labels

2018-07-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530264#comment-16530264 ] Marco Gaido commented on SPARK-24719: - Sure, thanks. I'll submit a PR ASAP. > ClusteringEvaluator

[jira] [Created] (SPARK-24149) Automatic namespaces discovery in HDFS federation

2018-05-02 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-24149: --- Summary: Automatic namespaces discovery in HDFS federation Key: SPARK-24149 URL: https://issues.apache.org/jira/browse/SPARK-24149 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24170) [Spark SQL] json file format is not dropped after dropping table

2018-05-03 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462545#comment-16462545 ] Marco Gaido commented on SPARK-24170: - This is true for every datasource. This is the expected

[jira] [Commented] (SPARK-22307) NOT condition working incorrectly

2017-10-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211239#comment-16211239 ] Marco Gaido commented on SPARK-22307: - Have you checked if the missing records contain null as a

[jira] [Resolved] (SPARK-22352) task failures with java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE error

2017-10-27 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-22352. - Resolution: Duplicate This is a duplicate of SPARK-6235. > task failures with

[jira] [Commented] (SPARK-22946) Recursive withColumn calls cause org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB

2018-01-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320402#comment-16320402 ] Marco Gaido commented on SPARK-22946: - I am unable to reproduce on master. If I remember correctly,
