Justin Uang created SPARK-25493:
---
Summary: CRLF Line Separators don't work in multiline CSVs
Key: SPARK-25493
URL: https://issues.apache.org/jira/browse/SPARK-25493
Project: Spark
Issue Type:
[
https://issues.apache.org/jira/browse/SPARK-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237056#comment-15237056
]
Justin Uang commented on SPARK-9850:
I like this idea a lot. One thing we encounter in our use cases
[
https://issues.apache.org/jira/browse/SPARK-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15226141#comment-15226141
]
Justin Uang commented on SPARK-2183:
Yup, we're hitting this as well
> Avoid loading/shuffling data
[
https://issues.apache.org/jira/browse/SPARK-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15125498#comment-15125498
]
Justin Uang commented on SPARK-9141:
Does your explain() string grow exponentially w.r.t. the
[
https://issues.apache.org/jira/browse/SPARK-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121275#comment-15121275
]
Justin Uang commented on SPARK-9301:
Do we have a plan on how to implement these in native Spark SQL?
[
https://issues.apache.org/jira/browse/SPARK-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121476#comment-15121476
]
Justin Uang commented on SPARK-9301:
Yea, my workaround has been json'ifying the struct into a string
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059055#comment-15059055
]
Justin Uang edited comment on SPARK-10915 at 12/15/15 11:07 PM:
An
[
https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059055#comment-15059055
]
Justin Uang commented on SPARK-10915:
-
An abstract base class would be fine, or something like
Justin Uang created SPARK-12157:
---
Summary: Support numpy types as return values of Python UDFs
Key: SPARK-12157
URL: https://issues.apache.org/jira/browse/SPARK-12157
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15043594#comment-15043594
]
Justin Uang commented on SPARK-12157:
-
Good question, Scala types would be good enough for this
Justin Uang created SPARK-10915:
---
Summary: Add support for UDAFs in Python
Key: SPARK-10915
URL: https://issues.apache.org/jira/browse/SPARK-10915
Project: Spark
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/SPARK-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744731#comment-14744731
]
Justin Uang commented on SPARK-9313:
This would be hugely helpful. I'm working on a platform that
[
https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735406#comment-14735406
]
Justin Uang commented on SPARK-8632:
Davies, what do you mean by upstream? I didn't quite understand
[
https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735749#comment-14735749
]
Justin Uang commented on SPARK-8632:
I set the batch mode to be 100, which is the same as before
[
https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736022#comment-14736022
]
Justin Uang commented on SPARK-8632:
Just pushed, any comments would be much appreciated. I didn't
[
https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733867#comment-14733867
]
Justin Uang commented on SPARK-8632:
Yea, I think that's the best solution for UDFs, since the number
[
https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734160#comment-14734160
]
Justin Uang commented on SPARK-8632:
I have a solution working on my computer. I'm going to clean it
[
https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734000#comment-14734000
]
Justin Uang commented on SPARK-8632:
I have started working on this. I hope to get a draft out soon.
Justin Uang created SPARK-10447:
---
Summary: Upgrade pyspark to use py4j 0.9
Key: SPARK-10447
URL: https://issues.apache.org/jira/browse/SPARK-10447
Project: Spark
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/SPARK-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731164#comment-14731164
]
Justin Uang commented on SPARK-10447:
-
Agreed, I'm pretty sure that this will break some APIs and
[
https://issues.apache.org/jira/browse/SPARK-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731592#comment-14731592
]
Justin Uang commented on SPARK-10447:
-
Sure, I wouldn't mind doing the code review. Can you add me?
[
https://issues.apache.org/jira/browse/SPARK-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731598#comment-14731598
]
Justin Uang commented on SPARK-10447:
-
Sounds good
> Upgrade pyspark to use py4j 0.9
>
[
https://issues.apache.org/jira/browse/SPARK-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649437#comment-14649437
]
Justin Uang commented on SPARK-9141:
(Taken from Spark dev email:
[
https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612608#comment-14612608
]
Justin Uang commented on SPARK-8632:
Haven't gotten around to it yet. I'll let you
[
https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601288#comment-14601288
]
Justin Uang commented on SPARK-8632:
[~davies], my current plan is to switch to a
Justin Uang created SPARK-8632:
--
Summary: Poor Python UDF performance because of RDD caching
Key: SPARK-8632
URL: https://issues.apache.org/jira/browse/SPARK-8632
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591817#comment-14591817
]
Justin Uang commented on SPARK-595:
---
Sure, I'll get to it after I finish some tasks for
[
https://issues.apache.org/jira/browse/SPARK-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590486#comment-14590486
]
Justin Uang commented on SPARK-595:
---
+1 We are using for internal testing to ensure that
[
https://issues.apache.org/jira/browse/SPARK-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566089#comment-14566089
]
Justin Uang commented on SPARK-7899:
Can we get this backported into Spark 1.4 or is
[
https://issues.apache.org/jira/browse/SPARK-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561709#comment-14561709
]
Justin Uang commented on SPARK-7899:
Building upon Michael's comment, the reason it
[
https://issues.apache.org/jira/browse/SPARK-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561732#comment-14561732
]
Justin Uang commented on SPARK-7899:
Building upon Michael's comment, the reason it
[
https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555151#comment-14555151
]
Justin Uang commented on SPARK-7768:
Agreed. For example, we wanted to add a
[
https://issues.apache.org/jira/browse/SPARK-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509209#comment-14509209
]
Justin Uang commented on SPARK-6999:
We might be able to use
[
https://issues.apache.org/jira/browse/SPARK-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502994#comment-14502994
]
Justin Uang commented on SPARK-6999:
Looking at the source, it looks like one way to
[
https://issues.apache.org/jira/browse/SPARK-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Justin Uang updated SPARK-6999:
---
Description:
It looks like
{code}
def createDataFrame(rowRDD: JavaRDD[Row], columns:
Justin Uang created SPARK-6999:
--
Summary: infinite recursion with createDataFrame(JavaRDD[Row],
java.util.List[String])
Key: SPARK-6999
URL: https://issues.apache.org/jira/browse/SPARK-6999
Project:
36 matches