[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-16 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150008#comment-15150008 ] Liang-Chi Hsieh commented on SPARK-13333: - If you don't attach a partition id, wouldn't your each

[jira] [Created] (SPARK-13358) Retrieve grep path when doing Benchmark

2016-02-16 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-13358: --- Summary: Retrieve grep path when doing Benchmark Key: SPARK-13358 URL: https://issues.apache.org/jira/browse/SPARK-13358 Project: Spark Issue Type:

[jira] [Created] (SPARK-13321) Support nested UNION in parser

2016-02-14 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-13321: --- Summary: Support nested UNION in parser Key: SPARK-13321 URL: https://issues.apache.org/jira/browse/SPARK-13321 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-13173) Fail to load CSV file with NPE

2016-02-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138744#comment-15138744 ] Liang-Chi Hsieh commented on SPARK-13173: - With the latest build, I can't reproduce this bug. > Fail

[jira] [Commented] (SPARK-13206) Subquery Alias in Hive Parser

2016-02-04 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133520#comment-15133520 ] Liang-Chi Hsieh commented on SPARK-13206: - I tried your query. Looks like it can be parsed. The

[jira] [Comment Edited] (SPARK-13206) Subquery Alias in Hive Parser

2016-02-04 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133520#comment-15133520 ] Liang-Chi Hsieh edited comment on SPARK-13206 at 2/5/16 2:14 AM: - I tried

[jira] [Commented] (SPARK-13139) Create native DDL commands

2016-02-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128142#comment-15128142 ] Liang-Chi Hsieh commented on SPARK-13139: - Yes. I would like to do this. > Create native DDL

[jira] [Commented] (SPARK-13087) Grouping by a complex expression may lead to incorrect AttributeReferences in aggregations

2016-02-01 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126012#comment-15126012 ] Liang-Chi Hsieh commented on SPARK-13087: - On the latest build, it looks like this problem no longer occurs.

[jira] [Created] (SPARK-13113) Remove unnecessary bit operation when decoding page number

2016-02-01 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-13113: --- Summary: Remove unnecessary bit operation when decoding page number Key: SPARK-13113 URL: https://issues.apache.org/jira/browse/SPARK-13113 Project: Spark

[jira] [Commented] (SPARK-12940) Partition field in Spark SQL WHERE clause causing Exception

2016-01-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118627#comment-15118627 ] Liang-Chi Hsieh commented on SPARK-12940: - I ran your example with latest Spark build and it

[jira] [Commented] (SPARK-12852) Support create table DDL with bucketing

2016-01-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118550#comment-15118550 ] Liang-Chi Hsieh commented on SPARK-12852: - Is this for data source API or Hive specified bucket

[jira] [Comment Edited] (SPARK-12890) Spark SQL query related to only partition fields should not scan the whole data.

2016-01-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115148#comment-15115148 ] Liang-Chi Hsieh edited comment on SPARK-12890 at 1/25/16 12:46 PM: --- As

[jira] [Commented] (SPARK-12890) Spark SQL query related to only partition fields should not scan the whole data.

2016-01-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115148#comment-15115148 ] Liang-Chi Hsieh commented on SPARK-12890: - As {{DataFrame.parquet}} accepts paths as parameter,

[jira] [Commented] (SPARK-12890) Spark SQL query related to only partition fields should not scan the whole data.

2016-01-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115144#comment-15115144 ] Liang-Chi Hsieh commented on SPARK-12890: - For the original issue, I think it might be because you

[jira] [Commented] (SPARK-12968) Implement command to set current database

2016-01-25 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115104#comment-15115104 ] Liang-Chi Hsieh commented on SPARK-12968: - I think I can work on this. [~hvanhovell] Since it is

[jira] [Commented] (SPARK-4878) driverPropsFetcher causes spurious Akka disassociate errors

2016-01-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114861#comment-15114861 ] Liang-Chi Hsieh commented on SPARK-4878: I think it is still alive and used. The above code sends

[jira] [Comment Edited] (SPARK-4878) driverPropsFetcher causes spurious Akka disassociate errors

2016-01-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114861#comment-15114861 ] Liang-Chi Hsieh edited comment on SPARK-4878 at 1/25/16 7:19 AM: - I think

[jira] [Comment Edited] (SPARK-4878) driverPropsFetcher causes spurious Akka disassociate errors

2016-01-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114861#comment-15114861 ] Liang-Chi Hsieh edited comment on SPARK-4878 at 1/25/16 7:17 AM: - I think

[jira] [Commented] (SPARK-12904) Strength reduction for integer/decimal comparisons

2016-01-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106491#comment-15106491 ] Liang-Chi Hsieh commented on SPARK-12904: - Yeah. I would like to do it. Thanks! > Strength

[jira] [Commented] (SPARK-12904) Strength reduction for integer/decimal comparisons

2016-01-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107998#comment-15107998 ] Liang-Chi Hsieh commented on SPARK-12904: - The rules should be: 1. int_col > decimal_literal =>

[jira] [Commented] (SPARK-12904) Strength reduction for integer/decimal comparisons

2016-01-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108020#comment-15108020 ] Liang-Chi Hsieh commented on SPARK-12904: - And also: 5. decimal_literal > int_col =>

[jira] [Commented] (SPARK-12740) grouping()/grouping_id() should work with having and order by

2016-01-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15091418#comment-15091418 ] Liang-Chi Hsieh commented on SPARK-12740: - [~davies] Do we have the functions grouping and

[jira] [Created] (SPARK-12733) Remove duplicate codes in ProjectCollapsing

2016-01-08 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-12733: --- Summary: Remove duplicate codes in ProjectCollapsing Key: SPARK-12733 URL: https://issues.apache.org/jira/browse/SPARK-12733 Project: Spark Issue

[jira] [Closed] (SPARK-12733) Remove duplicate codes in ProjectCollapsing

2016-01-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-12733. --- Resolution: Won't Fix > Remove duplicate codes in ProjectCollapsing >

[jira] [Comment Edited] (SPARK-12648) UDF with Option[Double] throws ClassCastException

2016-01-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090485#comment-15090485 ] Liang-Chi Hsieh edited comment on SPARK-12648 at 1/9/16 7:26 AM: - You

[jira] [Commented] (SPARK-12648) UDF with Option[Double] throws ClassCastException

2016-01-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090485#comment-15090485 ] Liang-Chi Hsieh commented on SPARK-12648: - You don't need to handle the null value. scala> val

[jira] [Created] (SPARK-12643) Set lib directory for antlr

2016-01-04 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-12643: --- Summary: Set lib directory for antlr Key: SPARK-12643 URL: https://issues.apache.org/jira/browse/SPARK-12643 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-12542) Support intersect/except in SQL

2015-12-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073704#comment-15073704 ] Liang-Chi Hsieh commented on SPARK-12542: - I think we may need to wait for SPARK-12362 getting

[jira] [Comment Edited] (SPARK-12542) Support intersect/except in SQL

2015-12-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073705#comment-15073705 ] Liang-Chi Hsieh edited comment on SPARK-12542 at 12/29/15 9:24 AM: ---

[jira] [Commented] (SPARK-12542) Support intersect/except in SQL

2015-12-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073705#comment-15073705 ] Liang-Chi Hsieh commented on SPARK-12542: - Besides, SQLContext already supports

[jira] [Commented] (SPARK-12542) Support intersect/except in SQL

2015-12-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073728#comment-15073728 ] Liang-Chi Hsieh commented on SPARK-12542: - [~davies] What do you think? > Support intersect/except

[jira] [Comment Edited] (SPARK-12542) Support intersect/except in SQL

2015-12-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073728#comment-15073728 ] Liang-Chi Hsieh edited comment on SPARK-12542 at 12/29/15 9:43 AM: ---

[jira] [Created] (SPARK-12448) Add UserDefinedType support to Cast

2015-12-21 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-12448: --- Summary: Add UserDefinedType support to Cast Key: SPARK-12448 URL: https://issues.apache.org/jira/browse/SPARK-12448 Project: Spark Issue Type:

[jira] [Created] (SPARK-12445) Fix null exception when passing null as array in toCatalystArray

2015-12-20 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-12445: --- Summary: Fix null exception when passing null as array in toCatalystArray Key: SPARK-12445 URL: https://issues.apache.org/jira/browse/SPARK-12445 Project:

[jira] [Created] (SPARK-12443) encoderFor should support Decimal

2015-12-19 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-12443: --- Summary: encoderFor should support Decimal Key: SPARK-12443 URL: https://issues.apache.org/jira/browse/SPARK-12443 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-12439) Fix toCatalystArray and MapObjects

2015-12-19 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-12439: --- Summary: Fix toCatalystArray and MapObjects Key: SPARK-12439 URL: https://issues.apache.org/jira/browse/SPARK-12439 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-12438) Add SQLUserDefinedType support for encoder

2015-12-19 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-12438: --- Summary: Add SQLUserDefinedType support for encoder Key: SPARK-12438 URL: https://issues.apache.org/jira/browse/SPARK-12438 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-12276) Prevent RejectedExecutionException by checking if ThreadPoolExecutor is shutdown and its capacity

2015-12-11 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh resolved SPARK-12276. - Resolution: Won't Fix > Prevent RejectedExecutionException by checking if

[jira] [Commented] (SPARK-12231) Failed to generate predicate Error when using dropna

2015-12-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15050357#comment-15050357 ] Liang-Chi Hsieh commented on SPARK-12231: - I have opened a PR

[jira] [Created] (SPARK-12276) Prevent RejectedExecutionException by checking if ThreadPoolExecutor is shutdown and its capacity

2015-12-10 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-12276: --- Summary: Prevent RejectedExecutionException by checking if ThreadPoolExecutor is shutdown and its capacity Key: SPARK-12276 URL:

[jira] [Closed] (SPARK-12203) Add KafkaDirectInputDStream that directly pulls messages from Kafka Brokers using receivers

2015-12-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-12203. --- Resolution: Won't Fix > Add KafkaDirectInputDStream that directly pulls messages from Kafka

[jira] [Commented] (SPARK-12203) Add KafkaDirectInputDStream that directly pulls messages from Kafka Brokers using receivers

2015-12-09 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048935#comment-15048935 ] Liang-Chi Hsieh commented on SPARK-12203: - Thanks for commenting. As I said on the PR, this is a

[jira] [Created] (SPARK-12203) Add KafkaDirectInputDStream that directly pulls messages from Kafka Brokers using receivers

2015-12-08 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-12203: --- Summary: Add KafkaDirectInputDStream that directly pulls messages from Kafka Brokers using receivers Key: SPARK-12203 URL: https://issues.apache.org/jira/browse/SPARK-12203

[jira] [Commented] (SPARK-12203) Add KafkaDirectInputDStream that directly pulls messages from Kafka Brokers using receivers

2015-12-08 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046708#comment-15046708 ] Liang-Chi Hsieh commented on SPARK-12203: - Our need is to have exactly once feature of

[jira] [Commented] (SPARK-12117) Column Aliases are Ignored in callUDF while using struct()

2015-12-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15041232#comment-15041232 ] Liang-Chi Hsieh commented on SPARK-12117: - Hmm, because the problem is due to the field names are

[jira] [Updated] (SPARK-12018) Refactor common subexpression elimination code

2015-11-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-12018: Description: The code of common subexpression elimination can be factored and simplified.

[jira] [Created] (SPARK-12018) Refactor common subexpression elimination code

2015-11-26 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-12018: --- Summary: Refactor common subexpression elimination code Key: SPARK-12018 URL: https://issues.apache.org/jira/browse/SPARK-12018 Project: Spark Issue

[jira] [Created] (SPARK-11955) Mark one side fields in merging schema for safely pushdowning filters in parquet

2015-11-24 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11955: --- Summary: Mark one side fields in merging schema for safely pushdowning filters in parquet Key: SPARK-11955 URL: https://issues.apache.org/jira/browse/SPARK-11955

[jira] [Closed] (SPARK-11915) Fix flaky python test pyspark.sql.group

2015-11-23 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-11915. --- Resolution: Not A Problem > Fix flaky python test pyspark.sql.group >

[jira] [Updated] (SPARK-11915) Fix flaky python test pyspark.sql.group

2015-11-22 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-11915: Description: The python test pyspark.sql.group will fail due to items' order in returned

[jira] [Created] (SPARK-11915) Fix flaky python test pyspark.sql.group

2015-11-22 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11915: --- Summary: Fix flaky python test pyspark.sql.group Key: SPARK-11915 URL: https://issues.apache.org/jira/browse/SPARK-11915 Project: Spark Issue Type:

[jira] [Created] (SPARK-11908) Add NullType support to RowEncoder

2015-11-22 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11908: --- Summary: Add NullType support to RowEncoder Key: SPARK-11908 URL: https://issues.apache.org/jira/browse/SPARK-11908 Project: Spark Issue Type:

[jira] [Commented] (SPARK-8233) misc function: hash

2015-11-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15020230#comment-15020230 ] Liang-Chi Hsieh commented on SPARK-8233: Sure. I will work on it. > misc function: hash >

[jira] [Created] (SPARK-11743) Add UserDefinedType support to RowEncoder

2015-11-13 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11743: --- Summary: Add UserDefinedType support to RowEncoder Key: SPARK-11743 URL: https://issues.apache.org/jira/browse/SPARK-11743 Project: Spark Issue Type:

[jira] [Commented] (SPARK-11698) Add option to ignore kafka messages that are out of limit rate

2015-11-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15003478#comment-15003478 ] Liang-Chi Hsieh commented on SPARK-11698: - Yes, but it is intentional. We don't want to increase

[jira] [Comment Edited] (SPARK-11698) Add option to ignore kafka messages that are out of limit rate

2015-11-12 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15003478#comment-15003478 ] Liang-Chi Hsieh edited comment on SPARK-11698 at 11/13/15 3:17 AM: ---

[jira] [Created] (SPARK-11698) Add option to ignore kafka messages that are out of limit rate

2015-11-12 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11698: --- Summary: Add option to ignore kafka messages that are out of limit rate Key: SPARK-11698 URL: https://issues.apache.org/jira/browse/SPARK-11698 Project: Spark

[jira] [Created] (SPARK-11593) Replace catalyst converter with RowEncoder in ScalaUDF

2015-11-09 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11593: --- Summary: Replace catalyst converter with RowEncoder in ScalaUDF Key: SPARK-11593 URL: https://issues.apache.org/jira/browse/SPARK-11593 Project: Spark

[jira] [Commented] (SPARK-11523) spark_partition_id() considered invalid function

2015-11-05 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993294#comment-14993294 ] Liang-Chi Hsieh commented on SPARK-11523: - That is because you use Spark UDF in Hive native

[jira] [Created] (SPARK-11448) We should skip caching part-files in ParquetRelation when configured to merge schema and respect summaries

2015-11-02 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11448: --- Summary: We should skip caching part-files in ParquetRelation when configured to merge schema and respect summaries Key: SPARK-11448 URL:

[jira] [Updated] (SPARK-11448) We should skip caching part-files in ParquetRelation when configured to merge schema and respect summaries

2015-11-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-11448: Description: We now cache part-files, metadata, common metadata in ParquetRelation as

[jira] [Created] (SPARK-11400) BroadcastNestedLoopJoin should support LeftSemi join

2015-10-29 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11400: --- Summary: BroadcastNestedLoopJoin should support LeftSemi join Key: SPARK-11400 URL: https://issues.apache.org/jira/browse/SPARK-11400 Project: Spark

[jira] [Closed] (SPARK-11341) Given non-zero ordinal toRow in the encoders of primitive types will cause problem

2015-10-27 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-11341. --- Resolution: Not A Problem > Given non-zero ordinal toRow in the encoders of primitive types

[jira] [Created] (SPARK-11363) LeftSemiJoin should be LeftSemi in SparkStrategies

2015-10-27 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11363: --- Summary: LeftSemiJoin should be LeftSemi in SparkStrategies Key: SPARK-11363 URL: https://issues.apache.org/jira/browse/SPARK-11363 Project: Spark

[jira] [Created] (SPARK-11362) Use Spark BitSet in BroadcastNestedLoopJoin

2015-10-27 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11362: --- Summary: Use Spark BitSet in BroadcastNestedLoopJoin Key: SPARK-11362 URL: https://issues.apache.org/jira/browse/SPARK-11362 Project: Spark Issue

[jira] [Created] (SPARK-11341) Given non-zero ordinal toRow in the encoders of primitive types will cause problem

2015-10-26 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11341: --- Summary: Given non-zero ordinal toRow in the encoders of primitive types will cause problem Key: SPARK-11341 URL: https://issues.apache.org/jira/browse/SPARK-11341

[jira] [Commented] (SPARK-9297) covar_pop and covar_samp aggregate functions

2015-10-20 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966084#comment-14966084 ] Liang-Chi Hsieh commented on SPARK-9297: Yes. I would like to take it. > covar_pop and covar_samp

[jira] [Commented] (SPARK-9162) Implement code generation for ScalaUDF

2015-10-16 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961693#comment-14961693 ] Liang-Chi Hsieh commented on SPARK-9162: [~rxin] Yes. I will submit a PR for this soon. >

[jira] [Created] (SPARK-11164) Add InSet pushdown filter back for Parquet

2015-10-16 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11164: --- Summary: Add InSet pushdown filter back for Parquet Key: SPARK-11164 URL: https://issues.apache.org/jira/browse/SPARK-11164 Project: Spark Issue Type:

[jira] [Created] (SPARK-11055) Use mixing hash-based and sort-based aggregation in TungstenAggregationIterator

2015-10-11 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-11055: --- Summary: Use mixing hash-based and sort-based aggregation in TungstenAggregationIterator Key: SPARK-11055 URL: https://issues.apache.org/jira/browse/SPARK-11055

[jira] [Commented] (SPARK-10968) Incorrect Join behavior in filter conditions

2015-10-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946358#comment-14946358 ] Liang-Chi Hsieh commented on SPARK-10968: - Is it incorrect? Because d5.value and d5_opp.value are

[jira] [Closed] (SPARK-10841) Add pushdown support of UDF for parquet

2015-10-07 Thread Liang-Chi Hsieh (JIRA)
Key: SPARK-10841 URL: https://issues.apache.org/jira/browse/SPARK-10841 Project: Spark Issue Type: New Feature Components: SQL Reporter: Liang-Chi Hsieh JIRA

[jira] [Commented] (SPARK-10909) Spark sql jdbc fails for Oracle NUMBER type columns

2015-10-02 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941230#comment-14941230 ] Liang-Chi Hsieh commented on SPARK-10909: - [~p02096] I created a pr for this problem. Since it is

[jira] [Created] (SPARK-10895) Add pushdown string filters for Parquet

2015-10-01 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-10895: --- Summary: Add pushdown string filters for Parquet Key: SPARK-10895 URL: https://issues.apache.org/jira/browse/SPARK-10895 Project: Spark Issue Type:

[jira] [Created] (SPARK-10841) Add pushdown support of UDF for parquet

2015-09-26 Thread Liang-Chi Hsieh (JIRA)
Issue Type: New Feature Components: SQL Reporter: Liang-Chi Hsieh JIRA: Currently we can't push down filters involving UDFs to Parquet. In practice, we have some usage of UDFs in filters, e.g., SELECT * FROM table WHERE udf(customer_id) = "ABC" In the above query, `c

[jira] [Commented] (SPARK-8386) DataFrame and JDBC regression

2015-09-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905940#comment-14905940 ] Liang-Chi Hsieh commented on SPARK-8386: [~phaumer] I can't reproduce this problem. Can you give

[jira] [Commented] (SPARK-8386) DataFrame and JDBC regression

2015-09-22 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14902294#comment-14902294 ] Liang-Chi Hsieh commented on SPARK-8386: OK. I will investigate this. > DataFrame and JDBC

[jira] [Commented] (SPARK-10578) pyspark.ml.classification.RandomForestClassifer does not return `rawPrediction` column

2015-09-14 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14743029#comment-14743029 ] Liang-Chi Hsieh commented on SPARK-10578: - Hi Karen, I think these columns are added to

[jira] [Created] (SPARK-10446) Support to specify join type when calling join with usingColumns

2015-09-04 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-10446: --- Summary: Support to specify join type when calling join with usingColumns Key: SPARK-10446 URL: https://issues.apache.org/jira/browse/SPARK-10446 Project:

[jira] [Created] (SPARK-10081) Skip re-computing getMissingParentStages in DAGScheduler

2015-08-18 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-10081: --- Summary: Skip re-computing getMissingParentStages in DAGScheduler Key: SPARK-10081 URL: https://issues.apache.org/jira/browse/SPARK-10081 Project: Spark

[jira] [Created] (SPARK-10031) Join two UnsafeRows in SortMergeJoin if possible

2015-08-16 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-10031: --- Summary: Join two UnsafeRows in SortMergeJoin if possible Key: SPARK-10031 URL: https://issues.apache.org/jira/browse/SPARK-10031 Project: Spark Issue

[jira] [Closed] (SPARK-9637) Add interface for implementing scheduling algorithm for standalone deployment

2015-08-14 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-9637. -- Resolution: Duplicate Add interface for implementing scheduling algorithm for standalone

[jira] [Commented] (SPARK-9936) decimal precision lost when loading DataFrame from RDD

2015-08-13 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695274#comment-14695274 ] Liang-Chi Hsieh commented on SPARK-9936: I think this problem is solved in current

[jira] [Created] (SPARK-9882) Priority-based scheduling for Spark applications

2015-08-12 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-9882: -- Summary: Priority-based scheduling for Spark applications Key: SPARK-9882 URL: https://issues.apache.org/jira/browse/SPARK-9882 Project: Spark Issue

[jira] [Created] (SPARK-9637) Add interface for implementing scheduling algorithm for standalone deployment

2015-08-05 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-9637: -- Summary: Add interface for implementing scheduling algorithm for standalone deployment Key: SPARK-9637 URL: https://issues.apache.org/jira/browse/SPARK-9637

[jira] [Commented] (SPARK-9347) spark load of existing parquet files extremely slow if large number of files

2015-07-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14647630#comment-14647630 ] Liang-Chi Hsieh commented on SPARK-9347: It will merge different schema if the

[jira] [Commented] (SPARK-9347) spark load of existing parquet files extremely slow if large number of files

2015-07-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14647673#comment-14647673 ] Liang-Chi Hsieh commented on SPARK-9347: Actually the newly introduced

[jira] [Commented] (SPARK-9362) Exception when using DataFrame groupby().sum on Decimal type in Python

2015-07-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14647338#comment-14647338 ] Liang-Chi Hsieh commented on SPARK-9362: As I just tested, yes. I think this ticket

[jira] [Commented] (SPARK-9347) spark load of existing parquet files extremely slow if large number of files

2015-07-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14647134#comment-14647134 ] Liang-Chi Hsieh commented on SPARK-9347: OK. The latest development is, we will

[jira] [Commented] (SPARK-9347) spark load of existing parquet files extremely slow if large number of files

2015-07-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14647136#comment-14647136 ] Liang-Chi Hsieh commented on SPARK-9347: Your concern should be solved in the

[jira] [Commented] (SPARK-6319) DISTINCT doesn't work for binary type

2015-07-29 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14645628#comment-14645628 ] Liang-Chi Hsieh commented on SPARK-6319: [~joshrosen] I will do it later. Thanks.

[jira] [Commented] (SPARK-9347) spark load of existing parquet files extremely slow if large number of files

2015-07-28 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14644557#comment-14644557 ] Liang-Chi Hsieh commented on SPARK-9347: Besides common metadata, I think there

[jira] [Commented] (SPARK-9347) spark load of existing parquet files extremely slow if large number of files

2015-07-28 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14644518#comment-14644518 ] Liang-Chi Hsieh commented on SPARK-9347: Currently, as we discussed in the PR, we

[jira] [Commented] (SPARK-9347) spark load of existing parquet files extremely slow if large number of files

2015-07-28 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14644571#comment-14644571 ] Liang-Chi Hsieh commented on SPARK-9347: Without any _metadata files in partition

[jira] [Created] (SPARK-9378) Remove failed and improper Hive test due to old ParquetRelation removed

2015-07-27 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-9378: -- Summary: Remove failed and improper Hive test due to old ParquetRelation removed Key: SPARK-9378 URL: https://issues.apache.org/jira/browse/SPARK-9378 Project:

[jira] [Commented] (SPARK-9347) spark load of existing parquet files extremely slow if large number of files

2015-07-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641911#comment-14641911 ] Liang-Chi Hsieh commented on SPARK-9347: I already tried to fix this in [this

[jira] [Commented] (SPARK-9347) spark load of existing parquet files extremely slow if large number of files

2015-07-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642030#comment-14642030 ] Liang-Chi Hsieh commented on SPARK-9347: I am not sure if it will be backported to

[jira] [Commented] (SPARK-9340) ParquetTypeConverter incorrectly handling of repeated types results in schema mismatch

2015-07-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641916#comment-14641916 ] Liang-Chi Hsieh commented on SPARK-9340: Your test will cause

[jira] [Updated] (SPARK-9361) Refactor new aggregation code to reduce the times of checking compatibility

2015-07-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-9361: --- Summary: Refactor new aggregation code to reduce the times of checking compatibility (was:

[jira] [Created] (SPARK-9361) Reduce the times of calling aggregate.Utils.tryConvert

2015-07-26 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-9361: -- Summary: Reduce the times of calling aggregate.Utils.tryConvert Key: SPARK-9361 URL: https://issues.apache.org/jira/browse/SPARK-9361 Project: Spark
