[jira] [Resolved] (SPARK-8461) ClassNotFoundException when code generation is enabled

2015-06-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-8461. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 6898

[jira] [Resolved] (SPARK-8320) Add example in streaming programming guide that shows union of multiple input streams

2015-06-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-8320. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 6862

[jira] [Assigned] (SPARK-8432) Fix hashCode and equals() of BinaryType in Row

2015-06-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-8432: - Assignee: Davies Liu Fix hashCode and equals() of BinaryType in Row

[jira] [Commented] (SPARK-8431) Add in operator to DataFrame Column in SparkR

2015-06-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592264#comment-14592264 ] Davies Liu commented on SPARK-8431: --- In Python, we convert the list into JavaList, then

[jira] [Commented] (SPARK-6813) SparkR style guide

2015-06-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591315#comment-14591315 ] Davies Liu commented on SPARK-6813: --- The lint tool is only used by Jenkins (or

[jira] [Created] (SPARK-8432) Fix hashCode and equals() of BinaryType in Row

2015-06-18 Thread Davies Liu (JIRA)
Davies Liu created SPARK-8432: - Summary: Fix hashCode and equals() of BinaryType in Row Key: SPARK-8432 URL: https://issues.apache.org/jira/browse/SPARK-8432 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-6390) Add MatrixUDT in PySpark

2015-06-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-6390. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 6354

[jira] [Resolved] (SPARK-7605) Python API for ElementwiseProduct

2015-06-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7605. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 6346

[jira] [Resolved] (SPARK-7199) Add date and timestamp support to UnsafeRow

2015-06-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7199. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 5984

[jira] [Updated] (SPARK-8348) Add in operator to DataFrame Column

2015-06-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-8348: -- Component/s: SparkR Add in operator to DataFrame Column ---

[jira] [Created] (SPARK-8346) User InternalRow instread of catalyst.InternalRow

2015-06-13 Thread Davies Liu (JIRA)
Davies Liu created SPARK-8346: - Summary: User InternalRow instread of catalyst.InternalRow Key: SPARK-8346 URL: https://issues.apache.org/jira/browse/SPARK-8346 Project: Spark Issue Type:

[jira] [Updated] (SPARK-8346) Use InternalRow instread of catalyst.InternalRow

2015-06-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-8346: -- Summary: Use InternalRow instread of catalyst.InternalRow (was: User InternalRow instread of

[jira] [Updated] (SPARK-8307) Improve timestamp from parquet

2015-06-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-8307: -- Summary: Improve timestamp from parquet (was: Improve timestamp from parquet/hive) Improve timestamp

[jira] [Created] (SPARK-8307) Improve timestamp from parquet/hive

2015-06-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-8307: - Summary: Improve timestamp from parquet/hive Key: SPARK-8307 URL: https://issues.apache.org/jira/browse/SPARK-8307 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-6419) GenerateOrdering does not support BinaryType and complex types.

2015-06-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580900#comment-14580900 ] Davies Liu commented on SPARK-6419: --- Fixed by SPARK-7956 GenerateOrdering does not

[jira] [Assigned] (SPARK-7186) Decouple internal Row from external Row

2015-06-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-7186: - Assignee: Davies Liu Decouple internal Row from external Row

[jira] [Resolved] (SPARK-6419) GenerateOrdering does not support BinaryType and complex types.

2015-06-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-6419. --- Resolution: Fixed Fix Version/s: 1.5.0 GenerateOrdering does not support BinaryType and

[jira] [Resolved] (SPARK-7814) Turn code generation on by default

2015-06-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7814. --- Resolution: Duplicate Fix Version/s: 1.5.0 Turn code generation on by default

[jira] [Created] (SPARK-8305) Improve codegen

2015-06-10 Thread Davies Liu (JIRA)
Davies Liu created SPARK-8305: - Summary: Improve codegen Key: SPARK-8305 URL: https://issues.apache.org/jira/browse/SPARK-8305 Project: Spark Issue Type: Bug Components: SQL

[jira] [Created] (SPARK-8202) PySpark: infinite loop during external sort

2015-06-09 Thread Davies Liu (JIRA)
Davies Liu created SPARK-8202: - Summary: PySpark: infinite loop during external sort Key: SPARK-8202 URL: https://issues.apache.org/jira/browse/SPARK-8202 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-8202) PySpark: infinite loop during external sort

2015-06-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578425#comment-14578425 ] Davies Liu commented on SPARK-8202: --- Workaround: increase the number of partitions

[jira] [Commented] (SPARK-8144) PySpark SQL readwriter options() does not work

2015-06-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576081#comment-14576081 ] Davies Liu commented on SPARK-8144: --- We could do something, but it will be tricky,

[jira] [Commented] (SPARK-8071) Run PySpark dataframe.rollup/cube test failed

2015-06-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575877#comment-14575877 ] Davies Liu commented on SPARK-8071: --- I think it may be related to JDK 8 (we didn't test

[jira] [Commented] (SPARK-8144) PySpark SQL readwriter options() does not work

2015-06-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575875#comment-14575875 ] Davies Liu commented on SPARK-8144: --- [~josephkb] It's not a typo, it's correct. Could

[jira] [Commented] (SPARK-8071) Run PySpark dataframe.rollup/cube test failed

2015-06-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573771#comment-14573771 ] Davies Liu commented on SPARK-8071: --- It failed on the Scala side, cc [~rxin] Run PySpark

[jira] [Assigned] (SPARK-6419) GenerateOrdering does not support BinaryType and complex types.

2015-06-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-6419: - Assignee: Davies Liu GenerateOrdering does not support BinaryType and complex types.

[jira] [Created] (SPARK-8117) Push codegen into Expression

2015-06-04 Thread Davies Liu (JIRA)
Davies Liu created SPARK-8117: - Summary: Push codegen into Expression Key: SPARK-8117 URL: https://issues.apache.org/jira/browse/SPARK-8117 Project: Spark Issue Type: Bug Components:

[jira] [Assigned] (SPARK-7184) Investigate turning codegen on by default

2015-06-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-7184: - Assignee: Davies Liu Investigate turning codegen on by default

[jira] [Resolved] (SPARK-7956) Use Janino to compile SQL expression

2015-06-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7956. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 6479

[jira] [Created] (SPARK-8070) Improve createDataFrame in Python

2015-06-03 Thread Davies Liu (JIRA)
Davies Liu created SPARK-8070: - Summary: Improve createDataFrame in Python Key: SPARK-8070 URL: https://issues.apache.org/jira/browse/SPARK-8070 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-7899) PySpark sql/tests breaks pylint validation

2015-06-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-7899: -- Fix Version/s: 1.4.0 PySpark sql/tests breaks pylint validation

[jira] [Updated] (SPARK-7899) PySpark sql/tests breaks pylint validation

2015-06-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-7899: -- Fix Version/s: (was: 1.5.0) PySpark sql/tests breaks pylint validation

[jira] [Commented] (SPARK-7899) PySpark sql/tests breaks pylint validation

2015-06-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569458#comment-14569458 ] Davies Liu commented on SPARK-7899: --- I did it yesterday. PySpark sql/tests breaks

[jira] [Assigned] (SPARK-6917) Broken data returned to PySpark dataframe if any large numbers used in Scala land

2015-06-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-6917: - Assignee: Davies Liu (was: Yin Huai) Broken data returned to PySpark dataframe if any large

[jira] [Created] (SPARK-7978) DecimalType should not be singleton

2015-05-30 Thread Davies Liu (JIRA)
Davies Liu created SPARK-7978: - Summary: DecimalType should not be singleton Key: SPARK-7978 URL: https://issues.apache.org/jira/browse/SPARK-7978 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-7954) Implicitly create SparkContext in sparkRSQL.init

2015-05-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7954. --- Resolution: Fixed Fix Version/s: 1.4.1 1.6.0 Issue resolved by pull request

[jira] [Updated] (SPARK-7954) Implicitly create SparkContext in sparkRSQL.init

2015-05-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-7954: -- Fix Version/s: (was: 1.6.0) 1.5.0 Implicitly create SparkContext in

[jira] [Resolved] (SPARK-7899) PySpark sql/tests breaks pylint validation

2015-05-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7899. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 6439

[jira] [Created] (SPARK-7956) Use Janino to compile SQL expression

2015-05-29 Thread Davies Liu (JIRA)
Davies Liu created SPARK-7956: - Summary: Use Janino to compile SQL expression Key: SPARK-7956 URL: https://issues.apache.org/jira/browse/SPARK-7956 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-7956) Use Janino to compile SQL expression

2015-05-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-7956: -- Description: The overhead of the current implementation of codegen is too high (100ms - 500ms), which

[jira] [Updated] (SPARK-7956) Use Janino to compile SQL expression

2015-05-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-7956: -- Description: The overhead of the current implementation of codegen is too high (50ms - 500ms), which blocks

[jira] [Commented] (SPARK-7909) spark-ec2 and associated tools not py3 ready

2015-05-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563352#comment-14563352 ] Davies Liu commented on SPARK-7909: --- [~meawoppl] It's true that some tools don't work

[jira] [Resolved] (SPARK-7908) PySpark Streaming tests are flaky.

2015-05-27 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7908. --- Resolution: Duplicate PySpark Streaming tests are flaky. --

[jira] [Updated] (SPARK-7806) spark-ec2 launch script fails for Python3

2015-05-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-7806: -- Assignee: (was: Davies Liu) spark-ec2 launch script fails for Python3

[jira] [Assigned] (SPARK-7806) spark-ec2 launch script fails for Python3

2015-05-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-7806: - Assignee: Davies Liu spark-ec2 launch script fails for Python3

[jira] [Resolved] (SPARK-7339) PySpark shuffle spill memory sometimes are not correct

2015-05-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7339. --- Resolution: Fixed Fix Version/s: 1.4.0 PySpark shuffle spill memory sometimes are not correct

[jira] [Updated] (SPARK-7339) PySpark shuffle spill memory sometimes are not correct

2015-05-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-7339: -- Assignee: Weizhong PySpark shuffle spill memory sometimes are not correct

[jira] [Resolved] (SPARK-5090) The improvement of python converter for hbase

2015-05-23 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-5090. --- Resolution: Fixed Fix Version/s: 1.5.0 Assignee: Gen TANG The improvement of python

[jira] [Created] (SPARK-7840) Movie Python DataFrame.insertInto into DataFrameWriter

2015-05-23 Thread Davies Liu (JIRA)
Davies Liu created SPARK-7840: - Summary: Movie Python DataFrame.insertInto into DataFrameWriter Key: SPARK-7840 URL: https://issues.apache.org/jira/browse/SPARK-7840 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-7840) Move Python DataFrame.insertInto into DataFrameWriter

2015-05-23 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7840. --- Resolution: Fixed Fix Version/s: 1.4.0 Move Python DataFrame.insertInto into DataFrameWriter

[jira] [Updated] (SPARK-7840) Move Python DataFrame.insertInto into DataFrameWriter

2015-05-23 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-7840: -- Summary: Move Python DataFrame.insertInto into DataFrameWriter (was: Movie Python DataFrame.insertInto

[jira] [Created] (SPARK-7836) DataFrame.ntile() should only accept Int as parameter

2015-05-22 Thread Davies Liu (JIRA)
Davies Liu created SPARK-7836: - Summary: DataFrame.ntile() should only accept Int as parameter Key: SPARK-7836 URL: https://issues.apache.org/jira/browse/SPARK-7836 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-7624) Task scheduler delay is increasing time over time in spark local mode

2015-05-22 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7624. --- Resolution: Fixed Fix Version/s: 1.3.2 Task scheduler delay is increasing time over time in

[jira] [Commented] (SPARK-7624) Task scheduler delay is increasing time over time in spark local mode

2015-05-22 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556962#comment-14556962 ] Davies Liu commented on SPARK-7624: --- Merged into the 1.3 branch. Task scheduler delay

[jira] [Assigned] (SPARK-7822) Window function support in Python DataFrame DSL

2015-05-22 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-7822: - Assignee: Davies Liu Window function support in Python DataFrame DSL

[jira] [Commented] (SPARK-6764) Add wheel package support for PySpark

2015-05-21 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554825#comment-14554825 ] Davies Liu commented on SPARK-6764: --- My first question is that, can we use wheel package

[jira] [Commented] (SPARK-6289) PySpark doesn't maintain SQL date Types

2015-05-21 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555286#comment-14555286 ] Davies Liu commented on SPARK-6289: --- This will be fixed by upgrading to Pyrolite 4.6,

[jira] [Commented] (SPARK-7565) Broken maps in jsonRDD

2015-05-20 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553216#comment-14553216 ] Davies Liu commented on SPARK-7565: --- [~tailhook] The patch is a kind of workaround, it

[jira] [Assigned] (SPARK-7565) Broken maps in jsonRDD

2015-05-20 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-7565: - Assignee: Davies Liu Broken maps in jsonRDD -- Key:

[jira] [Assigned] (SPARK-7606) Document all PySpark SQL/DataFrame public methods with @since tag

2015-05-20 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-7606: - Assignee: Davies Liu Document all PySpark SQL/DataFrame public methods with @since tag

[jira] [Assigned] (SPARK-7783) Add rollup and cube support to DataFrame Python DSL

2015-05-20 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-7783: - Assignee: Davies Liu (was: Cheng Hao) Add rollup and cube support to DataFrame Python DSL

[jira] [Created] (SPARK-7738) DataFramer reader/writer API in Python

2015-05-19 Thread Davies Liu (JIRA)
Davies Liu created SPARK-7738: - Summary: DataFramer reader/writer API in Python Key: SPARK-7738 URL: https://issues.apache.org/jira/browse/SPARK-7738 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-7721) Generate test coverage report from Python

2015-05-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551198#comment-14551198 ] Davies Liu commented on SPARK-7721: --- There are some tools to generate test coverage for

[jira] [Comment Edited] (SPARK-7688) PySpark + ipython throws port out of range exception

2015-05-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551187#comment-14551187 ] Davies Liu edited comment on SPARK-7688 at 5/19/15 8:45 PM:

[jira] [Commented] (SPARK-7688) PySpark + ipython throws port out of range exception

2015-05-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551187#comment-14551187 ] Davies Liu commented on SPARK-7688: --- @mengxr, could you work around it, or should be

[jira] [Commented] (SPARK-7606) Document all PySpark SQL/DataFrame public methods with @since tag

2015-05-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547079#comment-14547079 ] Davies Liu commented on SPARK-7606: --- +1 for `versionadded` Document all PySpark

[jira] [Assigned] (SPARK-6785) DateUtils can not handle date before 1970/01/01 correctly

2015-05-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-6785: - Assignee: Davies Liu DateUtils can not handle date before 1970/01/01 correctly

[jira] [Updated] (SPARK-6785) DateUtils can not handle date before 1970/01/01 correctly

2015-05-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-6785: -- Assignee: (was: Christian Tzolov) DateUtils can not handle date before 1970/01/01 correctly

[jira] [Updated] (SPARK-6785) DateUtils can not handle date before 1970/01/01 correctly

2015-05-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-6785: -- Assignee: Christian Tzolov (was: Davies Liu) DateUtils can not handle date before 1970/01/01

[jira] [Commented] (SPARK-7688) PySpark + ipython throws port out of range exception

2015-05-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547313#comment-14547313 ] Davies Liu commented on SPARK-7688: --- It runs fine on my Mac, could you try this? {code}

[jira] [Commented] (SPARK-6902) Row() object can be mutated even though it should be immutable

2015-05-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546274#comment-14546274 ] Davies Liu commented on SPARK-6902: --- [~jarfa] Python is a dynamic language, it's not

[jira] [Commented] (SPARK-6411) PySpark DataFrames can't be created if any datetimes have timezones

2015-05-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546463#comment-14546463 ] Davies Liu commented on SPARK-6411: --- Since TimestampType in Spark SQL does not support

[jira] [Comment Edited] (SPARK-6411) PySpark DataFrames can't be created if any datetimes have timezones

2015-05-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546463#comment-14546463 ] Davies Liu edited comment on SPARK-6411 at 5/16/15 1:02 AM:

[jira] [Updated] (SPARK-6806) SparkR examples in programming guide

2015-05-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-6806: -- Priority: Critical (was: Blocker) SparkR examples in programming guide

[jira] [Updated] (SPARK-6917) Broken data returned to PySpark dataframe if any large numbers used in Scala land

2015-05-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-6917: -- Priority: Critical (was: Major) Broken data returned to PySpark dataframe if any large numbers used

[jira] [Updated] (SPARK-6917) Broken data returned to PySpark dataframe if any large numbers used in Scala land

2015-05-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-6917: -- Assignee: Yin Huai (was: Davies Liu) Broken data returned to PySpark dataframe if any large numbers

[jira] [Commented] (SPARK-6917) Broken data returned to PySpark dataframe if any large numbers used in Scala land

2015-05-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546160#comment-14546160 ] Davies Liu commented on SPARK-6917: --- [~yhuai] It's a bug in SQL or Parquet library:

[jira] [Comment Edited] (SPARK-6917) Broken data returned to PySpark dataframe if any large numbers used in Scala land

2015-05-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546160#comment-14546160 ] Davies Liu edited comment on SPARK-6917 at 5/15/15 8:58 PM:

[jira] [Commented] (SPARK-7624) Task scheduler delay is increasing time over time in spark local mode

2015-05-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544978#comment-14544978 ] Davies Liu commented on SPARK-7624: --- In the context of Spark Streaming, there could be

[jira] [Updated] (SPARK-6289) PySpark doesn't maintain SQL date Types

2015-05-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-6289: -- Description: For the DateType, Spark SQL requires a datetime.date in Python. However, if you collect a

[jira] [Commented] (SPARK-6289) PySpark doesn't maintain SQL date Types

2015-05-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544652#comment-14544652 ] Davies Liu commented on SPARK-6289: --- [~mnazario] datetime is a subclass of date, so a

[jira] [Assigned] (SPARK-7624) Task scheduler delay is increasing time over time in spark local mode

2015-05-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-7624: - Assignee: Davies Liu Task scheduler delay is increasing time over time in spark local mode

[jira] [Commented] (SPARK-7624) Task scheduler delay is increasing time over time in spark local mode

2015-05-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544950#comment-14544950 ] Davies Liu commented on SPARK-7624: --- This is introduced by

[jira] [Commented] (SPARK-6289) PySpark doesn't maintain SQL date Types

2015-05-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541235#comment-14541235 ] Davies Liu commented on SPARK-6289: --- [~mnazario] Is this still a problem after we

[jira] [Updated] (SPARK-6949) Support Date/Timestamp in Column expression of DataFrame Python API

2015-04-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-6949: -- Summary: Support Date/Timestamp in Column expression of DataFrame Python API (was: Support Date and

[jira] [Assigned] (SPARK-6949) Support Date and Decimal in Column expression of DataFrame Python API

2015-04-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-6949: - Assignee: Davies Liu Support Date and Decimal in Column expression of DataFrame Python API

[jira] [Closed] (SPARK-6668) repeated asking to remove non-existent executor

2015-04-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-6668. - Resolution: Duplicate Fix Version/s: 1.4.0 repeated asking to remove non-existent executor

[jira] [Updated] (SPARK-4897) Python 3 support

2015-04-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-4897: -- Priority: Blocker (was: Minor) Python 3 support Key: SPARK-4897

[jira] [Commented] (SPARK-6857) Python SQL schema inference should support numpy types

2015-04-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498593#comment-14498593 ] Davies Liu commented on SPARK-6857: --- It's not good that we use array or numpy.array as

[jira] [Reopened] (SPARK-6216) Check Python version in worker before run PySpark job

2015-04-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reopened SPARK-6216: --- This merged patch does not work well if you have different major versions on the driver and the worker. Check

[jira] [Commented] (SPARK-6857) Python SQL schema inference should support numpy types

2015-04-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498414#comment-14498414 ] Davies Liu commented on SPARK-6857: --- [~josephkb] Because the serializer does not support

[jira] [Created] (SPARK-6953) Speedup tests of PySpark, reduce logging

2015-04-15 Thread Davies Liu (JIRA)
Davies Liu created SPARK-6953: - Summary: Speedup tests of PySpark, reduce logging Key: SPARK-6953 URL: https://issues.apache.org/jira/browse/SPARK-6953 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-6911) API for access MapType in DataFrame

2015-04-14 Thread Davies Liu (JIRA)
Davies Liu created SPARK-6911: - Summary: API for access MapType in DataFrame Key: SPARK-6911 URL: https://issues.apache.org/jira/browse/SPARK-6911 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-4897) Python 3 support

2015-04-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495443#comment-14495443 ] Davies Liu commented on SPARK-4897: --- That PR is pretty close to merge, we are targeting

[jira] [Created] (SPARK-6886) Big closure in PySpark will fail during shuffle

2015-04-13 Thread Davies Liu (JIRA)
Davies Liu created SPARK-6886: - Summary: Big closure in PySpark will fail during shuffle Key: SPARK-6886 URL: https://issues.apache.org/jira/browse/SPARK-6886 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-6890) Local cluster mode in Mac is broken

2015-04-13 Thread Davies Liu (JIRA)
Davies Liu created SPARK-6890: - Summary: Local cluster mode in Mac is broken Key: SPARK-6890 URL: https://issues.apache.org/jira/browse/SPARK-6890 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-6852) Accept numeric as numPartitions in SparkR

2015-04-10 Thread Davies Liu (JIRA)
Davies Liu created SPARK-6852: - Summary: Accept numeric as numPartitions in SparkR Key: SPARK-6852 URL: https://issues.apache.org/jira/browse/SPARK-6852 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-6806) SparkR examples in programming guide

2015-04-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-6806: - Assignee: Davies Liu SparkR examples in programming guide

[jira] [Created] (SPARK-6812) filter() on DataFrame does not work as expected

2015-04-09 Thread Davies Liu (JIRA)
Davies Liu created SPARK-6812: - Summary: filter() on DataFrame does not work as expected Key: SPARK-6812 URL: https://issues.apache.org/jira/browse/SPARK-6812 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-6806) SparkR examples in programming guide

2015-04-09 Thread Davies Liu (JIRA)
Davies Liu created SPARK-6806: - Summary: SparkR examples in programming guide Key: SPARK-6806 URL: https://issues.apache.org/jira/browse/SPARK-6806 Project: Spark Issue Type: New Feature
