[jira] [Created] (SPARK-16884) Move DataSourceScanExec out of ExistingRDD.scala file

2016-08-03 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16884: -- Summary: Move DataSourceScanExec out of ExistingRDD.scala file Key: SPARK-16884 URL: https://issues.apache.org/jira/browse/SPARK-16884 Project: Spark Issue

[jira] [Created] (SPARK-16818) Exchange reuse incorrectly reuses scans over different sets of partitions

2016-07-30 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16818: -- Summary: Exchange reuse incorrectly reuses scans over different sets of partitions Key: SPARK-16818 URL: https://issues.apache.org/jira/browse/SPARK-16818 Project: Spark

[jira] [Created] (SPARK-16596) Refactor DataSourceScanExec to do partition discovery at execution instead of planning time

2016-07-17 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16596: -- Summary: Refactor DataSourceScanExec to do partition discovery at execution instead of planning time Key: SPARK-16596 URL: https://issues.apache.org/jira/browse/SPARK-16596

[jira] [Created] (SPARK-16514) RegexExtract and RegexReplace crash on non-nullable input

2016-07-12 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16514: -- Summary: RegexExtract and RegexReplace crash on non-nullable input Key: SPARK-16514 URL: https://issues.apache.org/jira/browse/SPARK-16514 Project: Spark Issue

[jira] [Created] (SPARK-16432) Empty blocks fail to serialize due to assert in ChunkedByteBuffer

2016-07-07 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16432: -- Summary: Empty blocks fail to serialize due to assert in ChunkedByteBuffer Key: SPARK-16432 URL: https://issues.apache.org/jira/browse/SPARK-16432 Project: Spark

[jira] [Updated] (SPARK-16432) Empty blocks fail to serialize due to assert in ChunkedByteBuffer

2016-07-07 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-16432: --- Component/s: Spark Core > Empty blocks fail to serialize due to assert in ChunkedByteBuffer >

[jira] [Created] (SPARK-16238) Metrics for generated method bytecode size

2016-06-27 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16238: -- Summary: Metrics for generated method bytecode size Key: SPARK-16238 URL: https://issues.apache.org/jira/browse/SPARK-16238 Project: Spark Issue Type:

[jira] [Created] (SPARK-16025) Document OFF_HEAP storage level in 2.0

2016-06-17 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16025: -- Summary: Document OFF_HEAP storage level in 2.0 Key: SPARK-16025 URL: https://issues.apache.org/jira/browse/SPARK-16025 Project: Spark Issue Type: Documentation

[jira] [Created] (SPARK-16021) Zero out freed memory in test to help catch correctness bugs

2016-06-17 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16021: -- Summary: Zero out freed memory in test to help catch correctness bugs Key: SPARK-16021 URL: https://issues.apache.org/jira/browse/SPARK-16021 Project: Spark

[jira] [Created] (SPARK-15881) Update microbenchmark results

2016-06-10 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15881: -- Summary: Update microbenchmark results Key: SPARK-15881 URL: https://issues.apache.org/jira/browse/SPARK-15881 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-15860) Metrics for codegen size and perf

2016-06-09 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15860: -- Summary: Metrics for codegen size and perf Key: SPARK-15860 URL: https://issues.apache.org/jira/browse/SPARK-15860 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-15794) Should truncate toString() of very wide schemas

2016-06-06 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15794: -- Summary: Should truncate toString() of very wide schemas Key: SPARK-15794 URL: https://issues.apache.org/jira/browse/SPARK-15794 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-15735) Allow specifying min time to run in microbenchmarks

2016-06-02 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15735: -- Summary: Allow specifying min time to run in microbenchmarks Key: SPARK-15735 URL: https://issues.apache.org/jira/browse/SPARK-15735 Project: Spark Issue Type:

[jira] [Created] (SPARK-15724) Add benchmarks for performance over wide schemas

2016-06-01 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15724: -- Summary: Add benchmarks for performance over wide schemas Key: SPARK-15724 URL: https://issues.apache.org/jira/browse/SPARK-15724 Project: Spark Issue Type:

[jira] [Updated] (SPARK-15724) Add benchmarks for performance over wide schemas

2016-06-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15724: --- Affects Version/s: 2.0.0 > Add benchmarks for performance over wide schemas >

[jira] [Updated] (SPARK-15724) Add benchmarks for performance over wide schemas

2016-06-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15724: --- Description: There are some reported degradations in 2.0 when querying over very wide/nested

[jira] [Updated] (SPARK-15724) Add benchmarks for performance over wide schemas

2016-06-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15724: --- Description: There are some reported degradations in 2.0 when querying over very wide / deeply

[jira] [Updated] (SPARK-15724) Add benchmarks for performance over wide schemas

2016-06-01 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15724: --- Component/s: SQL > Add benchmarks for performance over wide schemas >

[jira] [Comment Edited] (SPARK-15634) SQL repl is bricked if a function is registered with a non-existent jar

2016-05-27 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304872#comment-15304872 ] Eric Liang edited comment on SPARK-15634 at 5/27/16 9:57 PM: - Note that

[jira] [Commented] (SPARK-15634) SQL repl is bricked if a function is registered with a non-existent jar

2016-05-27 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304872#comment-15304872 ] Eric Liang commented on SPARK-15634: Note that adding jars in the repl also doesn't work currently,

[jira] [Created] (SPARK-15634) SQL repl is bricked if a function is registered with a non-existent jar

2016-05-27 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15634: -- Summary: SQL repl is bricked if a function is registered with a non-existent jar Key: SPARK-15634 URL: https://issues.apache.org/jira/browse/SPARK-15634 Project: Spark

[jira] [Created] (SPARK-15520) SparkSession builder in python should also allow overriding confs of existing sessions

2016-05-24 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15520: -- Summary: SparkSession builder in python should also allow overriding confs of existing sessions Key: SPARK-15520 URL: https://issues.apache.org/jira/browse/SPARK-15520

[jira] [Updated] (SPARK-15520) SparkSession builder in python should also allow overriding confs of existing sessions

2016-05-24 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15520: --- Component/s: SQL > SparkSession builder in python should also allow overriding confs of existing >

[jira] [Resolved] (SPARK-15496) Spill metrics not updated when off-heap memory is enabled

2016-05-23 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang resolved SPARK-15496. Resolution: Fixed Fix Version/s: 2.0.0 Ah, actually the reproduction was incorrect. This

[jira] [Created] (SPARK-15496) Spill metrics not updated when off-heap memory is enabled

2016-05-23 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15496: -- Summary: Spill metrics not updated when off-heap memory is enabled Key: SPARK-15496 URL: https://issues.apache.org/jira/browse/SPARK-15496 Project: Spark Issue

[jira] [Created] (SPARK-15259) Sort time metric

2016-05-10 Thread Eric Liang (JIRA)
Eric Liang created SPARK-15259: -- Summary: Sort time metric Key: SPARK-15259 URL: https://issues.apache.org/jira/browse/SPARK-15259 Project: Spark Issue Type: Bug Components: SQL

[jira] [Updated] (SPARK-15259) Sort time metric should not include spill and record insertion time

2016-05-10 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-15259: --- Summary: Sort time metric should not include spill and record insertion time (was: Sort time

[jira] [Created] (SPARK-14851) Support radix sort with nullable longs

2016-04-22 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14851: -- Summary: Support radix sort with nullable longs Key: SPARK-14851 URL: https://issues.apache.org/jira/browse/SPARK-14851 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-14790) Scalastyle should run on compile in sbt

2016-04-20 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251171#comment-15251171 ] Eric Liang commented on SPARK-14790: You can cache the style results so it's not that different >

[jira] [Created] (SPARK-14790) Scalastyle should run on compile in sbt

2016-04-20 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14790: -- Summary: Scalastyle should run on compile in sbt Key: SPARK-14790 URL: https://issues.apache.org/jira/browse/SPARK-14790 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-14733) Allow custom timing control in microbenchmarks

2016-04-19 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14733: -- Summary: Allow custom timing control in microbenchmarks Key: SPARK-14733 URL: https://issues.apache.org/jira/browse/SPARK-14733 Project: Spark Issue Type:

[jira] [Created] (SPARK-14724) Improve performance of sorting by using radix sort when possible

2016-04-18 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14724: -- Summary: Improve performance of sorting by using radix sort when possible Key: SPARK-14724 URL: https://issues.apache.org/jira/browse/SPARK-14724 Project: Spark

[jira] [Updated] (SPARK-14724) Improve performance of sorting by using radix sort when possible

2016-04-18 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-14724: --- Component/s: Spark Core > Improve performance of sorting by using radix sort when possible >

[jira] [Commented] (SPARK-14475) Propagate user-defined context from driver to executors

2016-04-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232455#comment-15232455 ] Eric Liang commented on SPARK-14475: I think the main difference is that this is transparent to the

[jira] [Created] (SPARK-14475) Propagate user-defined context from driver to executors

2016-04-07 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14475: -- Summary: Propagate user-defined context from driver to executors Key: SPARK-14475 URL: https://issues.apache.org/jira/browse/SPARK-14475 Project: Spark Issue

[jira] [Commented] (SPARK-14252) Executors do not try to download remote cached blocks

2016-04-05 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227493#comment-15227493 ] Eric Liang commented on SPARK-14252: I'm going to take a look at fixing this > Executors do not try

[jira] [Commented] (SPARK-14359) Improve user experience for typed aggregate functions in Java

2016-04-04 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224476#comment-15224476 ] Eric Liang commented on SPARK-14359: Sure > Improve user experience for typed aggregate functions in

[jira] [Created] (SPARK-14227) [SQL] Add method for printing out generated code for debugging

2016-03-28 Thread Eric Liang (JIRA)
Eric Liang created SPARK-14227: -- Summary: [SQL] Add method for printing out generated code for debugging Key: SPARK-14227 URL: https://issues.apache.org/jira/browse/SPARK-14227 Project: Spark

[jira] [Created] (SPARK-12346) GLM summary crashes with NoSuchElementException if attributes are missing names

2015-12-15 Thread Eric Liang (JIRA)
Eric Liang created SPARK-12346: -- Summary: GLM summary crashes with NoSuchElementException if attributes are missing names Key: SPARK-12346 URL: https://issues.apache.org/jira/browse/SPARK-12346 Project:

[jira] [Commented] (SPARK-11965) Update user guide for RFormula feature interactions

2015-12-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047525#comment-15047525 ] Eric Liang commented on SPARK-11965: Will do On Tue, Dec 8, 2015, 1:11 PM Joseph K. Bradley (JIRA)

[jira] [Issue Comment Deleted] (SPARK-11965) Update user guide for RFormula feature interactions

2015-12-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-11965: --- Comment: was deleted (was: Will do On Tue, Dec 8, 2015, 1:11 PM Joseph K. Bradley (JIRA)

[jira] [Commented] (SPARK-10523) SparkR formula syntax to turn strings/factors into numerics

2015-09-09 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737960#comment-14737960 ] Eric Liang commented on SPARK-10523: We can convert to boolean easily enough, but supporting >2

[jira] [Commented] (SPARK-9895) User Guide for RFormula Feature Transformer

2015-08-12 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694371#comment-14694371 ] Eric Liang commented on SPARK-9895: --- Sure, I can take this task. User Guide for

[jira] [Created] (SPARK-9681) Support R feature interactions in RFormula

2015-08-06 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9681: - Summary: Support R feature interactions in RFormula Key: SPARK-9681 URL: https://issues.apache.org/jira/browse/SPARK-9681 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-9713) Document SparkR MLlib glm() integration in Spark 1.5

2015-08-06 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9713: - Summary: Document SparkR MLlib glm() integration in Spark 1.5 Key: SPARK-9713 URL: https://issues.apache.org/jira/browse/SPARK-9713 Project: Spark Issue Type:

[jira] [Created] (SPARK-9492) LogisticRegression should provide model statistics

2015-07-30 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9492: - Summary: LogisticRegression should provide model statistics Key: SPARK-9492 URL: https://issues.apache.org/jira/browse/SPARK-9492 Project: Spark Issue Type:

[jira] [Created] (SPARK-9463) Expose model coefficients with names in SparkR RFormula

2015-07-29 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9463: - Summary: Expose model coefficients with names in SparkR RFormula Key: SPARK-9463 URL: https://issues.apache.org/jira/browse/SPARK-9463 Project: Spark Issue Type:

[jira] [Created] (SPARK-9391) Support minus, dot, and intercept operators in SparkR RFormula

2015-07-27 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9391: - Summary: Support minus, dot, and intercept operators in SparkR RFormula Key: SPARK-9391 URL: https://issues.apache.org/jira/browse/SPARK-9391 Project: Spark

[jira] [Created] (SPARK-9230) SparkR RFormula should support StringType features

2015-07-21 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9230: - Summary: SparkR RFormula should support StringType features Key: SPARK-9230 URL: https://issues.apache.org/jira/browse/SPARK-9230 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-9230) SparkR RFormula should support StringType features

2015-07-21 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635790#comment-14635790 ] Eric Liang commented on SPARK-9230: --- Hmm, I think it would be hard to support that in a

[jira] [Created] (SPARK-9201) Integrate MLlib with SparkR using RFormula

2015-07-20 Thread Eric Liang (JIRA)
Eric Liang created SPARK-9201: - Summary: Integrate MLlib with SparkR using RFormula Key: SPARK-9201 URL: https://issues.apache.org/jira/browse/SPARK-9201 Project: Spark Issue Type: New Feature

[jira] [Closed] (SPARK-3349) Incorrect partitioning after LIMIT operator

2014-09-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang closed SPARK-3349. - Incorrect partitioning after LIMIT operator ---

[jira] [Closed] (SPARK-3394) TakeOrdered crashes when limit is 0

2014-09-08 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang closed SPARK-3394. - TakeOrdered crashes when limit is 0 --- Key: SPARK-3394

[jira] [Created] (SPARK-3394) TakeOrdered crashes when limit is 0

2014-09-03 Thread Eric Liang (JIRA)
Eric Liang created SPARK-3394: - Summary: TakeOrdered crashes when limit is 0 Key: SPARK-3394 URL: https://issues.apache.org/jira/browse/SPARK-3394 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-3349) [Spark SQL] Incorrect partitioning after LIMIT operator

2014-09-02 Thread Eric Liang (JIRA)
Eric Liang created SPARK-3349: - Summary: [Spark SQL] Incorrect partitioning after LIMIT operator Key: SPARK-3349 URL: https://issues.apache.org/jira/browse/SPARK-3349 Project: Spark Issue Type:
