[jira] [Comment Edited] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-30 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070618#comment-16070618 ] Li Jin edited comment on SPARK-21190 at 6/30/17 7:36 PM: - I have

[jira] [Comment Edited] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-30 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070618#comment-16070618 ] Li Jin edited comment on SPARK-21190 at 6/30/17 7:34 PM: - I have

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-30 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070618#comment-16070618 ] Li Jin commented on SPARK-21190: I have some APIs design written down here: Here is how t

[jira] [Comment Edited] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-26 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063873#comment-16063873 ] Li Jin edited comment on SPARK-21190 at 6/26/17 10:02 PM: -- [~r..

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-26 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063873#comment-16063873 ] Li Jin commented on SPARK-21190: [~r...@databricks.com], The use case of seeing entire p

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-26 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063103#comment-16063103 ] Li Jin commented on SPARK-21190: Very excited to see this. I created https://issues.apa

[jira] [Commented] (SPARK-20396) Add support for pandas udf in pyspark

2017-04-20 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977236#comment-15977236 ] Li Jin commented on SPARK-20396: I am currently working on this. I'll keep updating statu

[jira] [Created] (SPARK-20396) Add support for pandas udf in pyspark

2017-04-19 Thread Li Jin (JIRA)
Li Jin created SPARK-20396: -- Summary: Add support for pandas udf in pyspark Key: SPARK-20396 URL: https://issues.apache.org/jira/browse/SPARK-20396 Project: Spark Issue Type: New Feature C

[jira] [Commented] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2017-04-04 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956030#comment-15956030 ] Li Jin commented on SPARK-20144: > When you save the sorted data into Parquet, only the d

[jira] [Commented] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2017-03-31 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951202#comment-15951202 ] Li Jin commented on SPARK-20144: Thanks Sean! I appreciate your time and help very much.

[jira] [Commented] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2017-03-31 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951084#comment-15951084 ] Li Jin commented on SPARK-20144: Also, I am not sure about "If the data were sorted, sort

[jira] [Commented] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2017-03-31 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951073#comment-15951073 ] Li Jin commented on SPARK-20144: I totally agree Correctness takes precedence. If sorting

[jira] [Comment Edited] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2017-03-31 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950979#comment-15950979 ] Li Jin edited comment on SPARK-20144 at 3/31/17 2:14 PM: - Thanks

[jira] [Commented] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2017-03-31 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950979#comment-15950979 ] Li Jin commented on SPARK-20144: Thanks for getting back to me. Sorting in this case wil

[jira] [Commented] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2017-03-30 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950221#comment-15950221 ] Li Jin commented on SPARK-20144: Ping, anyone? This is a pretty big blocker for us. > sp

[jira] [Updated] (SPARK-20144) spark.read.parquet no long maintains ordering of the data

2017-03-29 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jin updated SPARK-20144: --- Summary: spark.read.parquet no long maintains ordering of the data (was: spark.read.parquet no long maintain

[jira] [Updated] (SPARK-20144) spark.read.parquet no long maintains the ordering the the data

2017-03-29 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jin updated SPARK-20144: --- Description: Hi, We are trying to upgrade Spark from 1.6.3 to 2.0.2. One issue we found is when we read parq

[jira] [Updated] (SPARK-20144) spark.read.parquet no long maintains the ordering the the data

2017-03-29 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jin updated SPARK-20144: --- Description: Hi, We are trying to upgrade Spark from 1.6.3 to 2.0.2. One issue we found is when we read parq

[jira] [Created] (SPARK-20144) spark.read.parquet no long maintains the ordering the the data

2017-03-29 Thread Li Jin (JIRA)
Li Jin created SPARK-20144: -- Summary: spark.read.parquet no long maintains the ordering the the data Key: SPARK-20144 URL: https://issues.apache.org/jira/browse/SPARK-20144 Project: Spark Issue Typ

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2016-12-01 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712825#comment-15712825 ] Li Jin commented on SPARK-13534: [~bryanc], Allow me to introduce myself. I am Li Jin a
