[jira] [Closed] (SPARK-13179) pyspark row name collision 'count'

2016-04-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-13179. -- Resolution: Won't Fix > pyspark row name collision 'count' > -- > >

[jira] [Resolved] (SPARK-14491) refactor object operator framework to make it easy to eliminate serializations

2016-04-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14491. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12260

[jira] [Resolved] (SPARK-14614) Add `bround` function

2016-04-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14614. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12376

[jira] [Commented] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245209#comment-15245209 ] Davies Liu commented on SPARK-13352: corrected, thanks > BlockFetch does not scale well on large

[jira] [Comment Edited] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15234559#comment-15234559 ] Davies Liu edited comment on SPARK-13352 at 4/18/16 6:40 AM: - The result is

[jira] [Assigned] (SPARK-14669) Some SQL metrics is broken when whole-stage codegen enabled

2016-04-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-14669: -- Assignee: Davies Liu > Some SQL metrics is broken when whole-stage codegen enabled >

[jira] [Created] (SPARK-14669) Some SQL metrics is broken when whole-stage codegen enabled

2016-04-15 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14669: -- Summary: Some SQL metrics is broken when whole-stage codegen enabled Key: SPARK-14669 URL: https://issues.apache.org/jira/browse/SPARK-14669 Project: Spark

[jira] [Assigned] (SPARK-14607) Partition pruning is case sensitive even with HiveContext

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-14607: -- Assignee: Davies Liu > Partition pruning is case sensitive even with HiveContext >

[jira] [Resolved] (SPARK-14484) Fail to create parquet filter if the column name does not match exactly

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14484. Resolution: Fixed Assignee: Davies Liu > Fail to create parquet filter if the column name

[jira] [Resolved] (SPARK-14607) Partition pruning is case sensitive even with HiveContext

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14607. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12371

[jira] [Created] (SPARK-14607) Partition pruning is case sensitive even with HiveContext

2016-04-13 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14607: -- Summary: Partition pruning is case sensitive even with HiveContext Key: SPARK-14607 URL: https://issues.apache.org/jira/browse/SPARK-14607 Project: Spark Issue

[jira] [Resolved] (SPARK-14581) Improve filter push down

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14581. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12342

[jira] [Commented] (SPARK-14600) Push predicates through Expand

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239688#comment-15239688 ] Davies Liu commented on SPARK-14600: cc [~cloud_fan] > Push predicates through Expand >

[jira] [Created] (SPARK-14600) Push predicates through Expand

2016-04-13 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14600: -- Summary: Push predicates through Expand Key: SPARK-14600 URL: https://issues.apache.org/jira/browse/SPARK-14600 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-14582) Increase the parallelism for small tables

2016-04-12 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14582: -- Summary: Increase the parallelism for small tables Key: SPARK-14582 URL: https://issues.apache.org/jira/browse/SPARK-14582 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-14578) Can't load a json dataset with nested wide schema

2016-04-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14578. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12338

[jira] [Created] (SPARK-14581) Improve filter push down

2016-04-12 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14581: -- Summary: Improve filter push down Key: SPARK-14581 URL: https://issues.apache.org/jira/browse/SPARK-14581 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-14363) Executor OOM due to a memory leak in Sorter

2016-04-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14363. Resolution: Fixed Fix Version/s: 1.6.2 2.0.0 Issue resolved by pull

[jira] [Resolved] (SPARK-14544) Spark UI is very slow in recent Chrome

2016-04-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14544. Resolution: Fixed Fix Version/s: 2.0.0 > Spark UI is very slow in recent Chrome >

[jira] [Created] (SPARK-14578) Can't load a json dataset with nested wide schema

2016-04-12 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14578: -- Summary: Can't load a json dataset with nested wide schema Key: SPARK-14578 URL: https://issues.apache.org/jira/browse/SPARK-14578 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-14562) Improve constraints propagation in Union

2016-04-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14562. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12328

[jira] [Created] (SPARK-14562) Improve constraints propagation in Union

2016-04-12 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14562: -- Summary: Improve constraints propagation in Union Key: SPARK-14562 URL: https://issues.apache.org/jira/browse/SPARK-14562 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-14544) Spark UI is very slow in recent Chrome

2016-04-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14544: -- Summary: Spark UI is very slow in recent Chrome Key: SPARK-14544 URL: https://issues.apache.org/jira/browse/SPARK-14544 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-14541) SQL function: IFNULL, NULLIF, NVL and NVL2

2016-04-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14541: -- Summary: SQL function: IFNULL, NULLIF, NVL and NVL2 Key: SPARK-14541 URL: https://issues.apache.org/jira/browse/SPARK-14541 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-14471) The alias created in SELECT could be used in GROUP BY and followed expressions

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14471: --- Description: This query should be able to run: {code} select a a1, a1 + 1 as b, count(1) from t

[jira] [Updated] (SPARK-14471) The alias created in SELECT could be used in GROUP BY and followed expressions

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14471: --- Summary: The alias created in SELECT could be used in GROUP BY and followed expressions (was: The

[jira] [Updated] (SPARK-14538) Increase the default stack size of spark shell

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14538: --- Assignee: (was: Davies Liu) > Increase the default stack size of spark shell >

[jira] [Commented] (SPARK-14526) The catalog of SQLContext should not be case-sensitive

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235617#comment-15235617 ] Davies Liu commented on SPARK-14526: It seems like a feature for a long time. It makes more sense to

[jira] [Created] (SPARK-14538) Increase the default stack size of spark shell

2016-04-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14538: -- Summary: Increase the default stack size of spark shell Key: SPARK-14538 URL: https://issues.apache.org/jira/browse/SPARK-14538 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-14454) Better exception handling while marking tasks as failed

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14454: --- Fix Version/s: 1.6.2 > Better exception handling while marking tasks as failed >

[jira] [Updated] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13352: --- Fix Version/s: 1.6.2 > BlockFetch does not scale well on large block >

[jira] [Resolved] (SPARK-14502) Add optimization for Binary Comparison Simplification

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14502. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12267

[jira] [Resolved] (SPARK-14528) SameResult on Union is broken

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14528. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12295

[jira] [Updated] (SPARK-14526) The catalog of SQLContext should not be case-sensitive

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14526: --- Priority: Blocker (was: Major) > The catalog of SQLContext should not be case-sensitive >

[jira] [Resolved] (SPARK-14524) In SparkSQL, it can't be select column of String type because of UTF8String when setting more than 32G for executors.

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14524. Resolution: Duplicate Assignee: Davies Liu > In SparkSQL, it can't be select column of

[jira] [Created] (SPARK-14528) SameResult on Union is broken

2016-04-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14528: -- Summary: SameResult on Union is broken Key: SPARK-14528 URL: https://issues.apache.org/jira/browse/SPARK-14528 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13352: --- Fix Version/s: (was: 1.6.2) > BlockFetch does not scale well on large block >

[jira] [Updated] (SPARK-14242) avoid too many copies in network when a network frame is large

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14242: --- Fix Version/s: 1.6.2 > avoid too many copies in network when a network frame is large >

[jira] [Comment Edited] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15234559#comment-15234559 ] Davies Liu edited comment on SPARK-13352 at 4/11/16 6:35 AM: - The result is

[jira] [Resolved] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13352. Resolution: Fixed Fix Version/s: 2.0.0 1.6.2 > BlockFetch does not scale

[jira] [Commented] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15234559#comment-15234559 ] Davies Liu commented on SPARK-13352: The result is much better now: {code} 50M2.2 seconds

[jira] [Updated] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13352: --- Assignee: Zhang, Liye > BlockFetch does not scale well on large block >

[jira] [Resolved] (SPARK-14217) Vectorized parquet reader produces wrong result if data used dictionary encoding fallback

2016-04-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14217. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12279

[jira] [Resolved] (SPARK-14419) Improve the HashedRelation for key fit within Long

2016-04-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14419. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12190

[jira] [Resolved] (SPARK-14454) Better exception handling while marking tasks as failed

2016-04-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14454. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12234

[jira] [Resolved] (SPARK-14448) Improvements to ColumnVector

2016-04-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14448. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12225

[jira] [Created] (SPARK-14484) Fail to create parquet filter if the column name does not match exactly

2016-04-08 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14484: -- Summary: Fail to create parquet filter if the column name does not match exactly Key: SPARK-14484 URL: https://issues.apache.org/jira/browse/SPARK-14484 Project: Spark

[jira] [Commented] (SPARK-8632) Poor Python UDF performance because of RDD caching

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231585#comment-15231585 ] Davies Liu commented on SPARK-8632: --- [~bijay697] Python UDFs had been improved a lot recently in master,

[jira] [Commented] (SPARK-14476) Show table name or path in string of DataSourceScan

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231354#comment-15231354 ] Davies Liu commented on SPARK-14476: cc [~lian cheng] > Show table name or path in string of

[jira] [Created] (SPARK-14476) Show table name or path in string of DataSourceScan

2016-04-07 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14476: -- Summary: Show table name or path in string of DataSourceScan Key: SPARK-14476 URL: https://issues.apache.org/jira/browse/SPARK-14476 Project: Spark Issue Type:

[jira] [Created] (SPARK-14471) The alias created in SELECT could be used in GROUP BY

2016-04-07 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14471: -- Summary: The alias created in SELECT could be used in GROUP BY Key: SPARK-14471 URL: https://issues.apache.org/jira/browse/SPARK-14471 Project: Spark Issue

[jira] [Resolved] (SPARK-12740) grouping()/grouping_id() should work with having and order by

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-12740. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12235

[jira] [Resolved] (SPARK-13932) CUBE Query with filter (HAVING) and condition (IF) raises an AnalysisException

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13932. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12235

[jira] [Commented] (SPARK-13842) Consider __iter__ and __getitem__ methods for pyspark.sql.types.StructType

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230799#comment-15230799 ] Davies Liu commented on SPARK-13842: Sounds good to me. > Consider __iter__ and __getitem__ methods

[jira] [Comment Edited] (SPARK-13932) CUBE Query with filter (HAVING) and condition (IF) raises an AnalysisException

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230767#comment-15230767 ] Davies Liu edited comment on SPARK-13932 at 4/7/16 6:22 PM: This will be

[jira] [Commented] (SPARK-13932) CUBE Query with filter (HAVING) and condition (IF) raises an AnalysisException

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230767#comment-15230767 ] Davies Liu commented on SPARK-13932: This will be fixed in https://github.com/apache/spark/pull/12235

[jira] [Resolved] (SPARK-14223) Cannot project all columns from a parquet files with ~1,100 columns

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14223. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12047

[jira] [Resolved] (SPARK-14310) Fix scan whole stage codegen to determine if batches are produced based on schema

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14310. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12047

[jira] [Resolved] (SPARK-14224) Cannot project all columns from a table with ~1,100 columns

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14224. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12047

[jira] [Assigned] (SPARK-13966) Regression using .withColumn() on a parquet

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-13966: -- Assignee: Davies Liu > Regression using .withColumn() on a parquet >

[jira] [Commented] (SPARK-13966) Regression using .withColumn() on a parquet

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229042#comment-15229042 ] Davies Liu commented on SPARK-13966: I checked this on latest master, it works, could you check this

[jira] [Updated] (SPARK-14031) Dataframe to csv IO, system performance enters high CPU state and write operation takes 1 hour to complete

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14031: --- Priority: Critical (was: Minor) > Dataframe to csv IO, system performance enters high CPU state and

[jira] [Resolved] (SPARK-13867) Failed to bind reference when cume_dist is used

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13867. Resolution: Fixed Assignee: Cheng Lian https://github.com/apache/spark/pull/12040 > Failed

[jira] [Commented] (SPARK-13346) Using DataFrames iteratively leads to massive query plans, which slows execution

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15228923#comment-15228923 ] Davies Liu commented on SPARK-13346: This is a known issue since the beginning of DataFrame (even Spark

[jira] [Updated] (SPARK-14317) Clean up hash join

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14317: --- Fix Version/s: 2.0.0 > Clean up hash join > -- > > Key: SPARK-14317

[jira] [Commented] (SPARK-14317) Clean up hash join

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15228908#comment-15228908 ] Davies Liu commented on SPARK-14317: https://github.com/apache/spark/pull/12102 > Clean up hash join

[jira] [Resolved] (SPARK-14317) Clean up hash join

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14317. Resolution: Fixed > Clean up hash join > -- > > Key: SPARK-14317 >

[jira] [Created] (SPARK-14419) Improve the HashedRelation for key fit within Long

2016-04-05 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14419: -- Summary: Improve the HashedRelation for key fit within Long Key: SPARK-14419 URL: https://issues.apache.org/jira/browse/SPARK-14419 Project: Spark Issue Type:

[jira] [Created] (SPARK-14418) Broadcast.unpersist() in PySpark is not consistent with that in Scala

2016-04-05 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14418: -- Summary: Broadcast.unpersist() in PySpark is not consistent with that in Scala Key: SPARK-14418 URL: https://issues.apache.org/jira/browse/SPARK-14418 Project: Spark

[jira] [Resolved] (SPARK-14353) Dateset Time Windowing API for Python, R, and SQL

2016-04-05 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14353. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12136

[jira] [Resolved] (SPARK-14334) Add toLocalIterator for Dataset

2016-04-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14334. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12114

[jira] [Resolved] (SPARK-12981) Dataframe distinct() followed by a filter(udf) in pyspark throws a casting error

2016-04-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-12981. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12127

[jira] [Updated] (SPARK-14231) JSON data source fails to infer floats as decimal when precision is bigger than 38 or scale is bigger than precision.

2016-04-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14231: --- Assignee: Hyukjin Kwon > JSON data source fails to infer floats as decimal when precision is bigger

[jira] [Resolved] (SPARK-14231) JSON data source fails to infer floats as decimal when precision is bigger than 38 or scale is bigger than precision.

2016-04-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14231. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12030

[jira] [Updated] (SPARK-13996) Add more not null attributes for Filter codegen

2016-04-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13996: --- Assignee: Liang-Chi Hsieh > Add more not null attributes for Filter codegen >

[jira] [Resolved] (SPARK-13996) Add more not null attributes for Filter codegen

2016-04-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13996. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 11810

[jira] [Updated] (SPARK-13996) Add more not null attributes for Filter codegen

2016-04-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13996: --- Fix Version/s: (was: 2.1.0) 2.0.0 > Add more not null attributes for Filter

[jira] [Resolved] (SPARK-14138) Generated SpecificColumnarIterator code can exceed JVM size limit for cached DataFrames

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14138. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12108

[jira] [Updated] (SPARK-13674) Add wholestage codegen support to Sample

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13674: --- Fix Version/s: (was: 2.1.0) 2.0.0 > Add wholestage codegen support to Sample

[jira] [Updated] (SPARK-13674) Add wholestage codegen support to Sample

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13674: --- Assignee: Liang-Chi Hsieh > Add wholestage codegen support to Sample >

[jira] [Resolved] (SPARK-13674) Add wholestage codegen support to Sample

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13674. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 11517

[jira] [Created] (SPARK-14334) Add toLocalIterator for Dataset

2016-04-01 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14334: -- Summary: Add toLocalIterator for Dataset Key: SPARK-14334 URL: https://issues.apache.org/jira/browse/SPARK-14334 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1503#comment-1503 ] Davies Liu commented on SPARK-13352: cc [~adav] > BlockFetch does not scale well on large block >

[jira] [Commented] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222199#comment-15222199 ] Davies Liu commented on SPARK-13352: After more investigating, it turned out that the block fetcher

[jira] [Resolved] (SPARK-14267) Execute multiple Python UDFs in single batch

2016-03-31 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14267. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12057

[jira] [Created] (SPARK-14317) Clean up hash join

2016-03-31 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14317: -- Summary: Clean up hash join Key: SPARK-14317 URL: https://issues.apache.org/jira/browse/SPARK-14317 Project: Spark Issue Type: Improvement Reporter:

[jira] [Commented] (SPARK-14230) Config the start time (jitter) for streaming jobs

2016-03-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218938#comment-15218938 ] Davies Liu commented on SPARK-14230: For non-window batch, could be supported via trigger, see

[jira] [Commented] (SPARK-14141) Let user specify datatypes of pandas dataframe in toPandas()

2016-03-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218909#comment-15218909 ] Davies Liu commented on SPARK-14141: toLocalIterator is better than collect, but will run partitions

[jira] [Commented] (SPARK-13820) TPC-DS Query 10 fails to compile

2016-03-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218906#comment-15218906 ] Davies Liu commented on SPARK-13820: [~jfc...@us.ibm.com] How much modification have you done? about

[jira] [Commented] (SPARK-14230) Config the start time (jitter) for streaming jobs

2016-03-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218657#comment-15218657 ] Davies Liu commented on SPARK-14230: This will be supported in structured streaming: see

[jira] [Created] (SPARK-14267) Execute multiple Python UDFs in single batch

2016-03-30 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14267: -- Summary: Execute multiple Python UDFs in single batch Key: SPARK-14267 URL: https://issues.apache.org/jira/browse/SPARK-14267 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-14215) Support chained Python UDF

2016-03-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14215. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12014

[jira] [Resolved] (SPARK-14210) Add timing metric for how long the query spent in scan

2016-03-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14210. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12007

[jira] [Resolved] (SPARK-14202) python_full_outer_join should use generator expression instead of list comp

2016-03-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14202. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11998

[jira] [Created] (SPARK-14215) Support chained Python UDF

2016-03-28 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14215: -- Summary: Support chained Python UDF Key: SPARK-14215 URL: https://issues.apache.org/jira/browse/SPARK-14215 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-14052) Build BytesToBytesMap in HashedRelation

2016-03-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14052. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11870

[jira] [Resolved] (SPARK-13844) Generate better code for filters with a non-nullable column

2016-03-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13844. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11684

[jira] [Resolved] (SPARK-12792) Refactor RRDD to support R UDF

2016-03-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-12792. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 10947

[jira] [Resolved] (SPARK-13742) Add non-iterator interface to RandomSampler

2016-03-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13742. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11578

[jira] [Commented] (SPARK-14141) Let user specify datatypes of pandas dataframe in toPandas()

2016-03-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213876#comment-15213876 ] Davies Liu commented on SPARK-14141: toPandas() is just a convenient way to convert a small
