[jira] [Commented] (SPARK-20953) Add hash map metrics to aggregate and join

2017-06-01 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033891#comment-16033891 ] Liang-Chi Hsieh commented on SPARK-20953: - [~rxin] Yeah, thanks for pinging me. I'll look into

[jira] [Commented] (SPARK-20916) Improve error message for unaliased subqueries in FROM clause

2017-05-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16028694#comment-16028694 ] Liang-Chi Hsieh commented on SPARK-20916: - I will look into this. Thanks. > Improve error

[jira] [Commented] (SPARK-20848) Dangling threads when reading parquet files in local mode

2017-05-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021237#comment-16021237 ] Liang-Chi Hsieh commented on SPARK-20848: - Ok. It seems better not to change the concurrency, I

[jira] [Commented] (SPARK-20848) Dangling threads when reading parquet files in local mode

2017-05-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021166#comment-16021166 ] Liang-Chi Hsieh commented on SPARK-20848: - It seems to me that to share the task support between

[jira] [Commented] (SPARK-20848) Dangling threads when reading parquet files in local mode

2017-05-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020966#comment-16020966 ] Liang-Chi Hsieh commented on SPARK-20848: - I am looking up this. Thanks [~sowen] for pinging me.

[jira] [Commented] (SPARK-20740) Expose UserDefinedType make sure could extends it

2017-05-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018358#comment-16018358 ] Liang-Chi Hsieh commented on SPARK-20740: - This is a duplicate of SPARK-7768. > Expose

[jira] [Commented] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2017-05-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015852#comment-16015852 ] Liang-Chi Hsieh commented on SPARK-17867: - The above example code can't compile with current

[jira] [Comment Edited] (SPARK-20703) Add an operator for writing data out

2017-05-15 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010021#comment-16010021 ] Liang-Chi Hsieh edited comment on SPARK-20703 at 5/15/17 2:21 PM: --

[jira] [Commented] (SPARK-20703) Add an operator for writing data out

2017-05-15 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010021#comment-16010021 ] Liang-Chi Hsieh commented on SPARK-20703: - [~tejasp] * It is a physical plan. Currently it is

[jira] [Comment Edited] (SPARK-20703) Add an operator for writing data out

2017-05-14 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009966#comment-16009966 ] Liang-Chi Hsieh edited comment on SPARK-20703 at 5/15/17 3:56 AM: -- I've

[jira] [Comment Edited] (SPARK-20703) Add an operator for writing data out

2017-05-14 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009966#comment-16009966 ] Liang-Chi Hsieh edited comment on SPARK-20703 at 5/15/17 3:48 AM: -- I've

[jira] [Commented] (SPARK-20703) Add an operator for writing data out

2017-05-14 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009966#comment-16009966 ] Liang-Chi Hsieh commented on SPARK-20703: - I've done something locally. Currently I wrap the

[jira] [Commented] (SPARK-20703) Add an operator for writing data out

2017-05-11 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006070#comment-16006070 ] Liang-Chi Hsieh commented on SPARK-20703: - Does "writing data out" mean writing data out through

[jira] [Commented] (SPARK-12225) Support adding or replacing multiple columns at once in DataFrame API

2017-05-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005904#comment-16005904 ] Liang-Chi Hsieh commented on SPARK-12225: - Without knowing this issue, I've implemented a

[jira] [Commented] (SPARK-20703) Add an operator for writing data out

2017-05-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005780#comment-16005780 ] Liang-Chi Hsieh commented on SPARK-20703: - [~rxin] Thanks for pinging me. Sure. I'd love to take

[jira] [Updated] (SPARK-20690) Analyzer shouldn't add missing attributes through subquery

2017-05-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-20690: Description: We add missing attributes into Filter in Analyzer. But we shouldn't do it

[jira] [Updated] (SPARK-20690) Analyzer shouldn't add missing attributes through subquery

2017-05-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-20690: Description: We add missing attributes into Filter in Analyzer. But we shouldn't do it

[jira] [Created] (SPARK-20690) Analyzer shouldn't add missing attributes through subquery

2017-05-10 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-20690: --- Summary: Analyzer shouldn't add missing attributes through subquery Key: SPARK-20690 URL: https://issues.apache.org/jira/browse/SPARK-20690 Project: Spark

[jira] [Closed] (SPARK-20612) Unresolvable attribute in Filter won't throw analysis exception

2017-05-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-20612. --- Resolution: Won't Fix > Unresolvable attribute in Filter won't throw analysis exception >

[jira] [Created] (SPARK-20612) Unresolvable attribute in Filter won't throw analysis exception

2017-05-05 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-20612: --- Summary: Unresolvable attribute in Filter won't throw analysis exception Key: SPARK-20612 URL: https://issues.apache.org/jira/browse/SPARK-20612 Project: Spark

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990548#comment-15990548 ] Liang-Chi Hsieh commented on SPARK-20392: - [~barrybecker4] I created SPARK-20542 to track the

[jira] [Created] (SPARK-20542) Add an API into Bucketizer that can bin a lot of columns all at once

2017-04-30 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-20542: --- Summary: Add an API into Bucketizer that can bin a lot of columns all at once Key: SPARK-20542 URL: https://issues.apache.org/jira/browse/SPARK-20542 Project:

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-28 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988892#comment-15988892 ] Liang-Chi Hsieh commented on SPARK-20392: - Yeah, I have the same concern that the time still

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984270#comment-15984270 ] Liang-Chi Hsieh commented on SPARK-20392: - [~barrybecker4] Btw, the time applying the model_9756

[jira] [Comment Edited] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984173#comment-15984173 ] Liang-Chi Hsieh edited comment on SPARK-20392 at 4/26/17 6:44 AM: --

[jira] [Comment Edited] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984173#comment-15984173 ] Liang-Chi Hsieh edited comment on SPARK-20392 at 4/26/17 6:43 AM: --

[jira] [Comment Edited] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984173#comment-15984173 ] Liang-Chi Hsieh edited comment on SPARK-20392 at 4/26/17 6:43 AM: --

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984194#comment-15984194 ] Liang-Chi Hsieh commented on SPARK-20392: - By disabling

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984173#comment-15984173 ] Liang-Chi Hsieh commented on SPARK-20392: - [~barrybecker4] Currently I think the performance

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15980735#comment-15980735 ] Liang-Chi Hsieh commented on SPARK-20392: - And Is it possible to attach the dataset that has

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15980732#comment-15980732 ] Liang-Chi Hsieh commented on SPARK-20392: - [~barrybecker4] You mentioned similar pipelines run

[jira] [Updated] (SPARK-20399) Can't use same regex pattern between 1.6 and 2.x due to unescaped sql string in parser

2017-04-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-20399: Description: The new SQL parser is introduced into Spark 2.0. Seems it bring an issue

[jira] [Commented] (SPARK-20399) Can't use same regex pattern between 1.6 and 2.x due to unescaped sql string in parser

2017-04-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15980603#comment-15980603 ] Liang-Chi Hsieh commented on SPARK-20399: - [~hvanhovell] Do you think this is a regression we

[jira] [Updated] (SPARK-20399) Can't use same regex pattern between 1.6 and 2.x due to unescaped sql string in parser

2017-04-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-20399: Description: The new SQL parser is introduced into Spark 2.0. Seems it bring an issue

[jira] [Updated] (SPARK-20399) Can't use same regex pattern between 1.6 and 2.x due to unescaped sql string in parser

2017-04-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-20399: Description: The new SQL parser is introduced into Spark 2.0. Seems it bring an issue

[jira] [Comment Edited] (SPARK-20399) Can't use same regex pattern between 1.6 and 2.x due to unescaped sql string in parser

2017-04-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976047#comment-15976047 ] Liang-Chi Hsieh edited comment on SPARK-20399 at 4/20/17 4:28 AM: -- I

[jira] [Updated] (SPARK-20399) Can't use same regex pattern between 1.6 and 2.x due to unescaped sql string in parser

2017-04-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-20399: Description: The new SQL parser is introduced into Spark 2.0. Seems it bring an issue

[jira] [Commented] (SPARK-20399) Can't use same regex pattern between 1.6 and 2.x due to unescaped sql string in parser

2017-04-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976047#comment-15976047 ] Liang-Chi Hsieh commented on SPARK-20399: - I already have the fix for this. I am not sure if

[jira] [Created] (SPARK-20399) Can't use same regex pattern between 1.6 and 2.x due to unescaped sql string in parser

2017-04-19 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-20399: --- Summary: Can't use same regex pattern between 1.6 and 2.x due to unescaped sql string in parser Key: SPARK-20399 URL: https://issues.apache.org/jira/browse/SPARK-20399

[jira] [Commented] (SPARK-20356) Spark sql group by returns incorrect results after join + distinct transformations

2017-04-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973936#comment-15973936 ] Liang-Chi Hsieh commented on SPARK-20356: - [~dkbiswal] Yeah, right. Thanks. We need to force the

[jira] [Commented] (SPARK-20356) Spark sql group by returns incorrect results after join + distinct transformations

2017-04-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973928#comment-15973928 ] Liang-Chi Hsieh commented on SPARK-20356: - I think I found the reason of the issue. I am working

[jira] [Comment Edited] (SPARK-20356) Spark sql group by returns incorrect results after join + distinct transformations

2017-04-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973902#comment-15973902 ] Liang-Chi Hsieh edited comment on SPARK-20356 at 4/19/17 2:05 AM: --

[jira] [Commented] (SPARK-20356) Spark sql group by returns incorrect results after join + distinct transformations

2017-04-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973902#comment-15973902 ] Liang-Chi Hsieh commented on SPARK-20356: - [~hvanhovell] I can't reproduce it with your example

[jira] [Commented] (SPARK-20356) Spark sql group by returns incorrect results after join + distinct transformations

2017-04-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973793#comment-15973793 ] Liang-Chi Hsieh commented on SPARK-20356: - [~dkbiswal] Thanks for pinging me. I will look into

[jira] [Commented] (SPARK-20292) string representation of TreeNode is messy

2017-04-11 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964154#comment-15964154 ] Liang-Chi Hsieh commented on SPARK-20292: - I will look into this. > string representation of

[jira] [Commented] (SPARK-20226) Call to sqlContext.cacheTable takes an incredibly long time in some cases

2017-04-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960835#comment-15960835 ] Liang-Chi Hsieh commented on SPARK-20226: - How many columns are added in above runs? I didn't see

[jira] [Commented] (SPARK-20246) Should check determinism when pushing predicates down through aggregation

2017-04-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960325#comment-15960325 ] Liang-Chi Hsieh commented on SPARK-20246: - For union and window, we don't have the replacement of

[jira] [Commented] (SPARK-20246) Should check determinism when pushing predicates down through aggregation

2017-04-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960300#comment-15960300 ] Liang-Chi Hsieh commented on SPARK-20246: - We should also check determinism of the [replaced

[jira] [Commented] (SPARK-20246) Should check determinism when pushing predicates down through aggregation

2017-04-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960287#comment-15960287 ] Liang-Chi Hsieh commented on SPARK-20246: - Seems we have checked determinism in

[jira] [Comment Edited] (SPARK-20226) Call to sqlContext.cacheTable takes an incredibly long time in some cases

2017-04-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960001#comment-15960001 ] Liang-Chi Hsieh edited comment on SPARK-20226 at 4/7/17 5:19 AM: -

[jira] [Comment Edited] (SPARK-20226) Call to sqlContext.cacheTable takes an incredibly long time in some cases

2017-04-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960001#comment-15960001 ] Liang-Chi Hsieh edited comment on SPARK-20226 at 4/7/17 5:19 AM: -

[jira] [Comment Edited] (SPARK-20226) Call to sqlContext.cacheTable takes an incredibly long time in some cases

2017-04-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960001#comment-15960001 ] Liang-Chi Hsieh edited comment on SPARK-20226 at 4/6/17 11:59 PM: --

[jira] [Commented] (SPARK-20226) Call to sqlContext.cacheTable takes an incredibly long time in some cases

2017-04-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960001#comment-15960001 ] Liang-Chi Hsieh commented on SPARK-20226: - {{spark.sql.constraintPropagation.enabled}} is a SQL

[jira] [Commented] (SPARK-20226) Call to sqlContext.cacheTable takes an incredibly long time in some cases

2017-04-06 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959072#comment-15959072 ] Liang-Chi Hsieh commented on SPARK-20226: - I am not sure what the job-server local.conf is. Does

[jira] [Commented] (SPARK-20226) Call to sqlContext.cacheTable takes an incredibly long time in some cases

2017-04-05 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958259#comment-15958259 ] Liang-Chi Hsieh commented on SPARK-20226: - [~barrybecker4] Can you try to disable this config

[jira] [Commented] (SPARK-20214) pyspark.mllib SciPyTests test_serialize

2017-04-04 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956237#comment-15956237 ] Liang-Chi Hsieh commented on SPARK-20214: - Confirmed that dok_matrix.tocsc() won't guarantee

[jira] [Commented] (SPARK-20193) Selecting empty struct causes ExpressionEncoder error.

2017-04-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954584#comment-15954584 ] Liang-Chi Hsieh commented on SPARK-20193: - Actually I am not sure what {{struct()}} represents.

[jira] [Commented] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2017-04-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954559#comment-15954559 ] Liang-Chi Hsieh commented on SPARK-20144: - I don't think the API has the guarantee about the data

[jira] [Closed] (SPARK-19443) The function to generate constraints takes too long when the query plan grows continuously

2017-03-31 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-19443. --- Resolution: Won't Fix > The function to generate constraints takes too long when the query

[jira] [Closed] (SPARK-19665) Improve constraint propagation

2017-03-31 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-19665. --- Resolution: Won't Fix > Improve constraint propagation > -- > >

[jira] [Created] (SPARK-20175) Exists should not be evaluated in Join operator and can be converted to ScalarSubquery if no correlated reference

2017-03-31 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-20175: --- Summary: Exists should not be evaluated in Join operator and can be converted to ScalarSubquery if no correlated reference Key: SPARK-20175 URL:

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939625#comment-15939625 ] Liang-Chi Hsieh commented on SPARK-14083: - [~maropu] Thanks! That's great! > Analyze JVM

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937908#comment-15937908 ] Liang-Chi Hsieh commented on SPARK-14083: - Yeah. Maybe I wrongly read your comment above. I

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937888#comment-15937888 ] Liang-Chi Hsieh commented on SPARK-14083: - Hmm, I am not sure if the current status of the branch

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937865#comment-15937865 ] Liang-Chi Hsieh commented on SPARK-14083: - [~kiszk] Can you update the branch? So I can send PR

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937796#comment-15937796 ] Liang-Chi Hsieh commented on SPARK-14083: - Great. I am not sure if it is ready for a (WIP) PR. I

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-22 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937754#comment-15937754 ] Liang-Chi Hsieh commented on SPARK-14083: - [~kiszk] Thanks for rebasing it. It is more convenient

[jira] [Commented] (SPARK-16060) Vectorized Orc reader

2017-03-22 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937711#comment-15937711 ] Liang-Chi Hsieh commented on SPARK-16060: - cc [~rxin] If the approach based on Hive package is

[jira] [Commented] (SPARK-17556) Executor side broadcast for broadcast joins

2017-03-22 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937707#comment-15937707 ] Liang-Chi Hsieh commented on SPARK-17556: - We may need to change the Target Version/s for this.

[jira] [Commented] (SPARK-19468) Dataset slow because of unnecessary shuffles

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906827#comment-15906827 ] Liang-Chi Hsieh commented on SPARK-19468: - We need a holistic solution for this issue. I

[jira] [Created] (SPARK-19931) InMemoryTableScanExec should rewrite output partitioning and ordering when aliasing output attributes

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-19931: --- Summary: InMemoryTableScanExec should rewrite output partitioning and ordering when aliasing output attributes Key: SPARK-19931 URL:

[jira] [Commented] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906773#comment-15906773 ] Liang-Chi Hsieh commented on SPARK-18281: - That is right. So you can try 2.1.1 or latest codebase

[jira] [Issue Comment Deleted] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-18281: Comment: was deleted (was: That is right. Btw, I think [~sowen] means "not 2.1.1. 2.0.3

[jira] [Commented] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906772#comment-15906772 ] Liang-Chi Hsieh commented on SPARK-18281: - That is right. Btw, I think [~sowen] means "not 2.1.1.

[jira] [Commented] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906523#comment-15906523 ] Liang-Chi Hsieh commented on SPARK-18281: - Oh. Btw, you can see the Fix Version/s of this JIRA is

[jira] [Commented] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906521#comment-15906521 ] Liang-Chi Hsieh commented on SPARK-18281: - [~lebigot] Thanks for the error log! It is weird

[jira] [Commented] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906519#comment-15906519 ] Liang-Chi Hsieh commented on SPARK-18281: - Besides, can you also provide the error log? >

[jira] [Commented] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906499#comment-15906499 ] Liang-Chi Hsieh commented on SPARK-18281: - Or you have other reproducible examples to test? >

[jira] [Commented] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906498#comment-15906498 ] Liang-Chi Hsieh commented on SPARK-18281: - Can you provide some info about your environment? Few

[jira] [Comment Edited] (SPARK-18281) toLocalIterator yields time out error on pyspark2

2017-03-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906498#comment-15906498 ] Liang-Chi Hsieh edited comment on SPARK-18281 at 3/12/17 11:41 AM: --- Can

[jira] [Created] (SPARK-19902) Support more expression canonicalization: Add, Subtract, Multiply and Divide

2017-03-10 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-19902: --- Summary: Support more expression canonicalization: Add, Subtract, Multiply and Divide Key: SPARK-19902 URL: https://issues.apache.org/jira/browse/SPARK-19902

[jira] [Commented] (SPARK-18667) input_file_name function does not work with UDF

2017-03-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901426#comment-15901426 ] Liang-Chi Hsieh commented on SPARK-18667: - I already created another JIRA SPARK-19223 for the

[jira] [Created] (SPARK-19846) Add a flag to disable constraint propagation

2017-03-06 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-19846: --- Summary: Add a flag to disable constraint propagation Key: SPARK-19846 URL: https://issues.apache.org/jira/browse/SPARK-19846 Project: Spark Issue

[jira] [Commented] (SPARK-19752) OrcGetSplits fails with 0 size files

2017-03-01 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890151#comment-15890151 ] Liang-Chi Hsieh commented on SPARK-19752: - Do you have a short example code that can reproduce

[jira] [Commented] (SPARK-19752) OrcGetSplits fails with 0 size files

2017-02-28 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889616#comment-15889616 ] Liang-Chi Hsieh commented on SPARK-19752: - From the log, looks like it is a problem in Hive? >

[jira] [Commented] (SPARK-15678) Not use cache on appends and overwrites

2017-02-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884061#comment-15884061 ] Liang-Chi Hsieh commented on SPARK-15678: - [~kiszk][~gen] I created SPARK-19736 for the reported

[jira] [Created] (SPARK-19736) refreshByPath should clear all cached plans with the specified path

2017-02-24 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-19736: --- Summary: refreshByPath should clear all cached plans with the specified path Key: SPARK-19736 URL: https://issues.apache.org/jira/browse/SPARK-19736 Project:

[jira] [Commented] (SPARK-19352) Sorting issues on relatively big datasets

2017-02-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884016#comment-15884016 ] Liang-Chi Hsieh commented on SPARK-19352: - I think this is in fact solved by SPARK-19563.

[jira] [Created] (SPARK-19665) Improve constraint propagation

2017-02-20 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-19665: --- Summary: Improve constraint propagation Key: SPARK-19665 URL: https://issues.apache.org/jira/browse/SPARK-19665 Project: Spark Issue Type: Improvement

[jira] [Closed] (SPARK-19530) Use guava weigher for code cache eviction

2017-02-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-19530. --- Resolution: Won't Fix > Use guava weigher for code cache eviction >

[jira] [Commented] (SPARK-19217) Offer easy cast from vector to array

2017-02-18 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873159#comment-15873159 ] Liang-Chi Hsieh commented on SPARK-19217: - The native casting of UserDefinedType from/to other

[jira] [Comment Edited] (SPARK-19653) `Vector` Type Should Be A First-Class Citizen In Spark SQL

2017-02-17 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872938#comment-15872938 ] Liang-Chi Hsieh edited comment on SPARK-19653 at 2/18/17 3:12 AM: --

[jira] [Commented] (SPARK-19653) `Vector` Type Should Be A First-Class Citizen In Spark SQL

2017-02-17 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872938#comment-15872938 ] Liang-Chi Hsieh commented on SPARK-19653: - Actually some Spark SQL functions like the mentioned

[jira] [Commented] (SPARK-19493) Remove Java 7 support

2017-02-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859166#comment-15859166 ] Liang-Chi Hsieh commented on SPARK-19493: - +1 > Remove Java 7 support > - >

[jira] [Created] (SPARK-19530) Use guava weigher for code cache eviction

2017-02-08 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-19530: --- Summary: Use guava weigher for code cache eviction Key: SPARK-19530 URL: https://issues.apache.org/jira/browse/SPARK-19530 Project: Spark Issue Type:

[jira] [Updated] (SPARK-19508) Improve error message when binding service fails

2017-02-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-19508: Description: Utils provides a helper function to bind service on port. This function can

[jira] [Created] (SPARK-19508) Improve error message when binding service fails

2017-02-07 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-19508: --- Summary: Improve error message when binding service fails Key: SPARK-19508 URL: https://issues.apache.org/jira/browse/SPARK-19508 Project: Spark Issue

[jira] [Closed] (SPARK-18824) Add optimizer rule to reorder expensive Filter predicates like ScalaUDF

2017-02-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-18824. --- Resolution: Won't Fix > Add optimizer rule to reorder expensive Filter predicates like

[jira] [Closed] (SPARK-15180) Support subexpression elimination in Fliter

2017-02-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-15180. --- Resolution: Won't Fix > Support subexpression elimination in Fliter >

[jira] [Commented] (SPARK-15180) Support subexpression elimination in Fliter

2017-02-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851491#comment-15851491 ] Liang-Chi Hsieh commented on SPARK-15180: - [~hyukjin.kwon] Yes. I resolved this. Thanks! >
