[jira] [Commented] (SPARK-21631) Building Spark with SBT unsuccessful when source code in Mllib is modified, But with MVN is ok

2017-08-04 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114532#comment-16114532 ] Liang-Chi Hsieh commented on SPARK-21631: - I've not tried. But from the building code, seems you

[jira] [Comment Edited] (SPARK-21631) Building Spark with SBT unsuccessful when source code in Mllib is modified, But with MVN is ok

2017-08-04 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114044#comment-16114044 ] Liang-Chi Hsieh edited comment on SPARK-21631 at 8/4/17 7:27 AM: - I can

[jira] [Commented] (SPARK-21631) Building Spark with SBT unsuccessful when source code in Mllib is modified, But with MVN is ok

2017-08-04 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114044#comment-16114044 ] Liang-Chi Hsieh commented on SPARK-21631: - I can make the same error by violating scala style

[jira] [Commented] (SPARK-21627) analyze hive table compute stats for columns with mixed case exception

2017-08-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113778#comment-16113778 ] Liang-Chi Hsieh commented on SPARK-21627: - I think it is just solved by SPARK-21599. > analyze

[jira] [Commented] (SPARK-21630) Pmod should not throw a divide by zero exception

2017-08-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113777#comment-16113777 ] Liang-Chi Hsieh commented on SPARK-21630: - Maybe duplicate to SPARK-21205? > Pmod should not

[jira] [Commented] (SPARK-21631) Building Spark with SBT unsuccessful when source code in Mllib is modified, But with MVN is ok

2017-08-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113775#comment-16113775 ] Liang-Chi Hsieh commented on SPARK-21631: - Looks like just it is not compliant with Spark code

[jira] [Commented] (SPARK-21591) Implement treeAggregate on Dataset API

2017-08-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110877#comment-16110877 ] Liang-Chi Hsieh commented on SPARK-21591: - [~yanboliang] Thanks for linking to the related JIRA.

[jira] [Commented] (SPARK-21567) Dataset with Tuple of type alias throws error

2017-08-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110432#comment-16110432 ] Liang-Chi Hsieh commented on SPARK-21567: - [~kretes] I've made a mistake when trying to solve

[jira] [Commented] (SPARK-21591) Implement treeAggregate on Dataset API

2017-08-01 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110226#comment-16110226 ] Liang-Chi Hsieh commented on SPARK-21591: - IIUC, basically the aggregation in SparkSQL doesn't

[jira] [Comment Edited] (SPARK-21591) Implement treeAggregate on Dataset API

2017-08-01 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110169#comment-16110169 ] Liang-Chi Hsieh edited comment on SPARK-21591 at 8/2/17 2:41 AM: - The

[jira] [Commented] (SPARK-21591) Implement treeAggregate on Dataset API

2017-08-01 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110169#comment-16110169 ] Liang-Chi Hsieh commented on SPARK-21591: - The most straightforward way is similar to

[jira] [Commented] (SPARK-21582) DataFrame.withColumnRenamed cause huge performance overhead

2017-07-31 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108310#comment-16108310 ] Liang-Chi Hsieh commented on SPARK-21582: - Please call toDF API with the renamed column names. It

[jira] [Commented] (SPARK-21567) Dataset with Tuple of type alias throws error

2017-07-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106781#comment-16106781 ] Liang-Chi Hsieh commented on SPARK-21567: - If there is no more questions about this, we can close

[jira] [Resolved] (SPARK-21256) Add WithSQLConf to Catalyst Test

2017-07-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh resolved SPARK-21256. - Resolution: Fixed > Add WithSQLConf to Catalyst Test >

[jira] [Commented] (SPARK-21567) Dataset with Tuple of type alias throws error

2017-07-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106317#comment-16106317 ] Liang-Chi Hsieh commented on SPARK-21567: - I tried it. Seems Scala fails to find implicit encoder

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-07-28 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16105919#comment-16105919 ] Liang-Chi Hsieh commented on SPARK-21274: - [~Tagar] I've tried the query on PostgreSQL, the

[jira] [Commented] (SPARK-21555) GROUP BY don't work with expressions with NVL and nested objects

2017-07-28 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16105092#comment-16105092 ] Liang-Chi Hsieh commented on SPARK-21555: - The sync between PR and JIRA seems broken still. I

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-07-27 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102969#comment-16102969 ] Liang-Chi Hsieh commented on SPARK-21274: - [~Tagar] Is the rewrite of INTERSECT ALL correct?

[jira] [Commented] (SPARK-21177) df.saveAsTable slows down linearly, with number of appends

2017-07-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098344#comment-16098344 ] Liang-Chi Hsieh commented on SPARK-21177: - [~hyukjin.kwon] I ran spark-shell and your code

[jira] [Resolved] (SPARK-20754) Add Function Alias For MOD/TRUNCT/POSITION

2017-07-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh resolved SPARK-20754. - Resolution: Fixed > Add Function Alias For MOD/TRUNCT/POSITION >

[jira] [Resolved] (SPARK-21102) Refresh command is too aggressive in parsing

2017-07-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh resolved SPARK-21102. - Resolution: Fixed > Refresh command is too aggressive in parsing >

[jira] [Commented] (SPARK-21513) SQL to_json should support all column types

2017-07-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097931#comment-16097931 ] Liang-Chi Hsieh commented on SPARK-21513: - Thanks a lot. Note that as I have not permission to

[jira] [Commented] (SPARK-21513) SQL to_json should support all column types

2017-07-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097922#comment-16097922 ] Liang-Chi Hsieh commented on SPARK-21513: - If for scala part only, it seems a starter. Should we

[jira] [Updated] (SPARK-21497) Pull non-deterministic joining keys from Join operator

2017-07-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21497: Description: Currently SparkSQL doesn't support non-deterministic joining conditions in

[jira] [Created] (SPARK-21497) Pull non-deterministic joining keys from Join operator

2017-07-20 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-21497: --- Summary: Pull non-deterministic joining keys from Join operator Key: SPARK-21497 URL: https://issues.apache.org/jira/browse/SPARK-21497 Project: Spark

[jira] [Comment Edited] (SPARK-21486) Fail when using aliased column of a aliased table from a subquery

2017-07-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095689#comment-16095689 ] Liang-Chi Hsieh edited comment on SPARK-21486 at 7/21/17 2:34 AM: -- Since

[jira] [Commented] (SPARK-21486) Fail when using aliased column of a aliased table from a subquery

2017-07-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095689#comment-16095689 ] Liang-Chi Hsieh commented on SPARK-21486: - Since 2.2.0, it is not allowed to use the qualifier

[jira] [Updated] (SPARK-21484) Wrong query plans of Dataset after persist/unpersist

2017-07-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21484: Description: After the call of persist/unpersist, the query plans of a Dataset should be

[jira] [Updated] (SPARK-21484) Wrong query plans of Dataset after persist/unpersist

2017-07-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21484: Summary: Wrong query plans of Dataset after persist/unpersist (was: Wrong query plans of

[jira] [Updated] (SPARK-21484) Wrong query plans of Dataset after persist/unperesist

2017-07-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21484: Description: After the call of persist/unpersist, the query plans of a Dataset should be

[jira] [Created] (SPARK-21484) Wrong query plans of Dataset after persist/unperesist

2017-07-20 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-21484: --- Summary: Wrong query plans of Dataset after persist/unperesist Key: SPARK-21484 URL: https://issues.apache.org/jira/browse/SPARK-21484 Project: Spark

[jira] [Commented] (SPARK-21177) df.saveAsTable slows down linearly, with number of appends

2017-07-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092643#comment-16092643 ] Liang-Chi Hsieh commented on SPARK-21177: - I can't reproduce the reported issue with the codes.

[jira] [Commented] (SPARK-21316) Dataset Union output is not consistent with the column sequence

2017-07-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092596#comment-16092596 ] Liang-Chi Hsieh commented on SPARK-21316: - As unionByName was merged, I think it can solve this

[jira] [Commented] (SPARK-21437) Java Keyword cannot be used in table schema

2017-07-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092589#comment-16092589 ] Liang-Chi Hsieh commented on SPARK-21437: - For reference, the PR adds this check:

[jira] [Commented] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092586#comment-16092586 ] Liang-Chi Hsieh commented on SPARK-21439: - If there's no pr submitted and no one claims working

[jira] [Updated] (SPARK-21441) Incorrect Codegen in SortMergeJoinExec results failures in some cases

2017-07-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21441: Priority: Major (was: Critical) > Incorrect Codegen in SortMergeJoinExec results failures

[jira] [Commented] (SPARK-21441) Incorrect Codegen in SortMergeJoinExec results failures in some cases

2017-07-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092583#comment-16092583 ] Liang-Chi Hsieh commented on SPARK-21441: - Btw, I think the priority of this issue should not be

[jira] [Comment Edited] (SPARK-20703) Add an operator for writing data out

2017-07-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085017#comment-16085017 ] Liang-Chi Hsieh edited comment on SPARK-20703 at 7/13/17 5:28 AM: --

[jira] [Commented] (SPARK-20703) Add an operator for writing data out

2017-07-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085017#comment-16085017 ] Liang-Chi Hsieh commented on SPARK-20703: - Thanks [~ste...@apache.org] for voicing this. For the

[jira] [Commented] (SPARK-21316) Dataset Union output is not consistent with the column sequence

2017-07-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16076031#comment-16076031 ] Liang-Chi Hsieh commented on SPARK-21316: - Seems there are users confusing about this. The

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073072#comment-16073072 ] Liang-Chi Hsieh commented on SPARK-21109: - I'm not arguing anything...I just explain why the

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073053#comment-16073053 ] Liang-Chi Hsieh commented on SPARK-21109: - Oh. I see. The document is fixed recently by

[jira] [Comment Edited] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073047#comment-16073047 ] Liang-Chi Hsieh edited comment on SPARK-21109 at 7/4/17 2:59 AM: - You

[jira] [Comment Edited] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073047#comment-16073047 ] Liang-Chi Hsieh edited comment on SPARK-21109 at 7/4/17 2:55 AM: - You

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073048#comment-16073048 ] Liang-Chi Hsieh commented on SPARK-21109: - Not sure why the generated doc doesn't show it. But

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073047#comment-16073047 ] Liang-Chi Hsieh commented on SPARK-21109: - You claim that they have the same schema if we print

[jira] [Comment Edited] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073005#comment-16073005 ] Liang-Chi Hsieh edited comment on SPARK-21109 at 7/4/17 1:29 AM: - They

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-07-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073005#comment-16073005 ] Liang-Chi Hsieh commented on SPARK-21109: - They have the same schema? Let's print schema on the

[jira] [Commented] (SPARK-21198) SparkSession catalog is terribly slow

2017-07-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071931#comment-16071931 ] Liang-Chi Hsieh commented on SPARK-21198: - I'd close this for now. If you have further question,

[jira] [Closed] (SPARK-21198) SparkSession catalog is terribly slow

2017-07-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-21198. --- Resolution: Won't Fix > SparkSession catalog is terribly slow >

[jira] [Commented] (SPARK-21277) Spark is invoking an incorrect serializer after UDAF completion

2017-07-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071929#comment-16071929 ] Liang-Chi Hsieh commented on SPARK-21277: - The call to {{InternalRow.getArray}} returns an

[jira] [Comment Edited] (SPARK-21198) SparkSession catalog is terribly slow

2017-07-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071897#comment-16071897 ] Liang-Chi Hsieh edited comment on SPARK-21198 at 7/3/17 4:14 AM: - I mean

[jira] [Commented] (SPARK-21198) SparkSession catalog is terribly slow

2017-07-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071897#comment-16071897 ] Liang-Chi Hsieh commented on SPARK-21198: - I mean using {{show tables}} and {{show databases}}.

[jira] [Comment Edited] (SPARK-21198) SparkSession catalog is terribly slow

2017-07-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071897#comment-16071897 ] Liang-Chi Hsieh edited comment on SPARK-21198 at 7/3/17 4:13 AM: - I mean

[jira] [Comment Edited] (SPARK-21198) SparkSession catalog is terribly slow

2017-07-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071876#comment-16071876 ] Liang-Chi Hsieh edited comment on SPARK-21198 at 7/3/17 3:09 AM: - Since

[jira] [Commented] (SPARK-21198) SparkSession catalog is terribly slow

2017-07-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071876#comment-16071876 ] Liang-Chi Hsieh commented on SPARK-21198: - Since you can use {{sql("show tables")}} to get the

[jira] [Comment Edited] (SPARK-21198) SparkSession catalog is terribly slow

2017-07-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071876#comment-16071876 ] Liang-Chi Hsieh edited comment on SPARK-21198 at 7/3/17 3:08 AM: - Since

[jira] [Resolved] (SPARK-20953) Add hash map metrics to aggregate and join

2017-06-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh resolved SPARK-20953. - Resolution: Fixed > Add hash map metrics to aggregate and join >

[jira] [Updated] (SPARK-20690) Subqueries in FROM should have alias names

2017-06-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-20690: Description: We add missing attributes into Filter in Analyzer. But we shouldn't do it

[jira] [Updated] (SPARK-20690) Subqueries in FROM should have alias names

2017-06-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-20690: Summary: Subqueries in FROM should have alias names (was: Analyzer shouldn't add missing

[jira] [Comment Edited] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-06-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062529#comment-16062529 ] Liang-Chi Hsieh edited comment on SPARK-19104 at 6/26/17 4:37 AM: -- Just

[jira] [Commented] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-06-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062529#comment-16062529 ] Liang-Chi Hsieh commented on SPARK-19104: - Just found this issue. I proposed a PR to fix it.

[jira] [Commented] (SPARK-21198) SparkSession catalog is terribly slow

2017-06-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062499#comment-16062499 ] Liang-Chi Hsieh commented on SPARK-21198: - Thanks, [~revolucion09]. Can you measure how much time

[jira] [Commented] (SPARK-21198) SparkSession catalog is terribly slow

2017-06-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062489#comment-16062489 ] Liang-Chi Hsieh commented on SPARK-21198: - [~revolucion09] Any update? Because it involves

[jira] [Commented] (SPARK-21204) RuntimeException with Set and Case Class in Spark 2.1.1

2017-06-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062488#comment-16062488 ] Liang-Chi Hsieh commented on SPARK-21204: - [~maropu] I think It's more likely to turn to use set

[jira] [Commented] (SPARK-21198) SparkSession catalog is terribly slow

2017-06-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062194#comment-16062194 ] Liang-Chi Hsieh commented on SPARK-21198: - Have you profile the time spent in your application?

[jira] [Updated] (SPARK-21198) SparkSession catalog is terribly slow

2017-06-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21198: Component/s: (was: Spark Core) SQL > SparkSession catalog is terribly

[jira] [Commented] (SPARK-21198) SparkSession catalog is terribly slow

2017-06-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061977#comment-16061977 ] Liang-Chi Hsieh commented on SPARK-21198: - Btw, [~revolucion09] may I ask the number of tables in

[jira] [Commented] (SPARK-21198) SparkSession catalog is terribly slow

2017-06-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061901#comment-16061901 ] Liang-Chi Hsieh commented on SPARK-21198: - Btw, I personally think it's not a bug, but an

[jira] [Commented] (SPARK-21198) SparkSession catalog is terribly slow

2017-06-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061900#comment-16061900 ] Liang-Chi Hsieh commented on SPARK-21198: - I will submit a PR for this soon. > SparkSession

[jira] [Commented] (SPARK-21198) SparkSession catalog is terribly slow

2017-06-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061899#comment-16061899 ] Liang-Chi Hsieh commented on SPARK-21198: - {{CatalogImpl.listTables}} actually returns the

[jira] [Closed] (SPARK-21134) Codegen-only expressions should not be collapsed with upper CodegenFallback expression

2017-06-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-21134. --- Resolution: Won't Fix > Codegen-only expressions should not be collapsed with upper

[jira] [Commented] (SPARK-21134) Codegen-only expressions should not be collapsed with upper CodegenFallback expression

2017-06-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055499#comment-16055499 ] Liang-Chi Hsieh commented on SPARK-21134: - Due to the discussion on the PR, we may remove

[jira] [Comment Edited] (SPARK-21101) Error running Hive temporary UDTF on latest Spark 2.2

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053647#comment-16053647 ] Liang-Chi Hsieh edited comment on SPARK-21101 at 6/19/17 9:01 AM: -- -May

[jira] [Commented] (SPARK-21101) Error running Hive temporary UDTF on latest Spark 2.2

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053647#comment-16053647 ] Liang-Chi Hsieh commented on SPARK-21101: - May I ask what Hive version your UDTF is based on? >

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053564#comment-16053564 ] Liang-Chi Hsieh commented on SPARK-21109: - Btw, there is a related ticket SPARK-21043 which is

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053563#comment-16053563 ] Liang-Chi Hsieh commented on SPARK-21109: - So if you don't have more comments on this, I'd think

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053559#comment-16053559 ] Liang-Chi Hsieh commented on SPARK-21109: - Specifically, both data1 and data2 are the same type

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053551#comment-16053551 ] Liang-Chi Hsieh commented on SPARK-21109: - The {{Dataset.union}} method has the following

[jira] [Created] (SPARK-21134) Codegen-only expressions should not be collapsed with upper CodegenFallback expression

2017-06-18 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-21134: --- Summary: Codegen-only expressions should not be collapsed with upper CodegenFallback expression Key: SPARK-21134 URL: https://issues.apache.org/jira/browse/SPARK-21134

[jira] [Commented] (SPARK-21052) Add hash map metrics to join

2017-06-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045769#comment-16045769 ] Liang-Chi Hsieh commented on SPARK-21052: - I'll submit a PR for this soon. > Add hash map

[jira] [Updated] (SPARK-21052) Add hash map metrics to join

2017-06-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21052: Description: We should add avg hash map probe metric to join operator and report it on UI.

[jira] [Updated] (SPARK-21051) Add hash map metrics to aggregate

2017-06-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21051: Description: We should add avg hash map probe metric to aggregate operator and report it

[jira] [Created] (SPARK-21052) Add hash map metrics to join

2017-06-10 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-21052: --- Summary: Add hash map metrics to join Key: SPARK-21052 URL: https://issues.apache.org/jira/browse/SPARK-21052 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-21051) Add hash map metrics to aggregate

2017-06-10 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-21051: --- Summary: Add hash map metrics to aggregate Key: SPARK-21051 URL: https://issues.apache.org/jira/browse/SPARK-21051 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-21001) Staging folders from Hive table are not being cleared.

2017-06-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043869#comment-16043869 ] Liang-Chi Hsieh commented on SPARK-21001: - No, I mean the current 2.0 branch in git. I think

[jira] [Comment Edited] (SPARK-20953) Add hash map metrics to aggregate and join

2017-06-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042419#comment-16042419 ] Liang-Chi Hsieh edited comment on SPARK-20953 at 6/8/17 8:50 AM: - [~rxin]

[jira] [Commented] (SPARK-20953) Add hash map metrics to aggregate and join

2017-06-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042419#comment-16042419 ] Liang-Chi Hsieh commented on SPARK-20953: - [~rxin] Are we just going to log an error for too

[jira] [Commented] (SPARK-21011) RDD filter can combine/corrupt columns

2017-06-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042337#comment-16042337 ] Liang-Chi Hsieh commented on SPARK-21011: - I can't re-produce it on current codebase. Because you

[jira] [Commented] (SPARK-20998) BroadcastHashJoin producing wrong results

2017-06-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042321#comment-16042321 ] Liang-Chi Hsieh commented on SPARK-20998: - I've tried on current codebase and can't reproduce it.

[jira] [Comment Edited] (SPARK-21001) Staging folders from Hive table are not being cleared.

2017-06-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040341#comment-16040341 ] Liang-Chi Hsieh edited comment on SPARK-21001 at 6/8/17 5:20 AM: - I found

[jira] [Commented] (SPARK-21001) Staging folders from Hive table are not being cleared.

2017-06-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040341#comment-16040341 ] Liang-Chi Hsieh commented on SPARK-21001: - I found a PR which backports the related fix to 2.0

[jira] [Commented] (SPARK-20998) BroadcastHashJoin producing wrong results

2017-06-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040321#comment-16040321 ] Liang-Chi Hsieh commented on SPARK-20998: - Can you provide a sample data to reproduce this issue?

[jira] [Commented] (SPARK-21002) Syntax error regression when creating Hive storage handlers on Spark shell

2017-06-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040057#comment-16040057 ] Liang-Chi Hsieh commented on SPARK-21002: - This is duplicate to SPARK-19360. > Syntax error

[jira] [Commented] (SPARK-21002) Syntax error regression when creating Hive storage handlers on Spark shell

2017-06-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040055#comment-16040055 ] Liang-Chi Hsieh commented on SPARK-21002: - Seems that we don't support {{STORED BY

[jira] [Commented] (SPARK-20969) last() aggregate function fails returning the right answer with ordered windows

2017-06-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039862#comment-16039862 ] Liang-Chi Hsieh commented on SPARK-20969: - [~pletelli] I don't find an api doc for that. Maybe we

[jira] [Commented] (SPARK-20969) last() aggregate function fails returning the right answer with ordered windows

2017-06-05 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036690#comment-16036690 ] Liang-Chi Hsieh commented on SPARK-20969: - It seems to me that the second result isn't the same
