[jira] [Commented] (SPARK-26257) SPIP: Interop Support for Spark Language Extensions

2019-03-03 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782633#comment-16782633 ] Jeff Zhang commented on SPARK-26257: I know Apache Beam provides one abstraction layer for multiple

[jira] [Commented] (SPARK-22640) Can not switch python exec in executor side

2017-11-28 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270139#comment-16270139 ] Jeff Zhang commented on SPARK-22640: You need to use spark.yarn.appMasterEnv since you are using YARN

[jira] [Commented] (SPARK-22095) java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CALLBACK_HOST

2017-09-21 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175770#comment-16175770 ] Jeff Zhang commented on SPARK-22095: Could you tell me how to reproduce this issue? >

[jira] [Commented] (SPARK-21186) PySpark with --packages fails to import library due to lack of pythonpath to .ivy2/jars/*.jar

2017-07-01 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071254#comment-16071254 ] Jeff Zhang commented on SPARK-21186: I think this is due to how spark-deep-learning distributes its

[jira] [Updated] (SPARK-20249) Add summary for LinearSVCModel

2017-04-10 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-20249: --- Component/s: PySpark > Add summary for LinearSVCModel > -- > >

[jira] [Commented] (SPARK-20249) Add summary for LinearSVCModel

2017-04-07 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960540#comment-15960540 ] Jeff Zhang commented on SPARK-20249: Will work on it. > Add summary for LinearSVCModel >

[jira] [Created] (SPARK-20249) Add summary for LinearSVCModel

2017-04-07 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-20249: -- Summary: Add summary for LinearSVCModel Key: SPARK-20249 URL: https://issues.apache.org/jira/browse/SPARK-20249 Project: Spark Issue Type: Improvement

[jira] [Comment Edited] (SPARK-20001) Support PythonRunner executing inside a Conda env

2017-03-18 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930922#comment-15930922 ] Jeff Zhang edited comment on SPARK-20001 at 3/18/17 1:08 PM: - Thanks

[jira] [Comment Edited] (SPARK-20001) Support PythonRunner executing inside a Conda env

2017-03-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930922#comment-15930922 ] Jeff Zhang edited comment on SPARK-20001 at 3/18/17 12:17 AM: -- Thanks

[jira] [Comment Edited] (SPARK-20001) Support PythonRunner executing inside a Conda env

2017-03-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930922#comment-15930922 ] Jeff Zhang edited comment on SPARK-20001 at 3/18/17 12:14 AM: -- Thanks

[jira] [Issue Comment Deleted] (SPARK-20001) Support PythonRunner executing inside a Conda env

2017-03-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-20001: --- Comment: was deleted (was: Thanks [~dansanduleac] It looks like we are doing similar things, recently I

[jira] [Commented] (SPARK-20001) Support PythonRunner executing inside a Conda env

2017-03-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930924#comment-15930924 ] Jeff Zhang commented on SPARK-20001: Thanks [~dansanduleac] It looks like we are doing similar things,

[jira] [Commented] (SPARK-20001) Support PythonRunner executing inside a Conda env

2017-03-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930922#comment-15930922 ] Jeff Zhang commented on SPARK-20001: Thanks [~dansanduleac] It looks like we are doing similar things,

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2017-03-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15923683#comment-15923683 ] Jeff Zhang commented on SPARK-13587: I linked a detailed document about how to use it in both batch

[jira] [Commented] (SPARK-19439) PySpark's registerJavaFunction Should Support UDAFs

2017-03-08 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901290#comment-15901290 ] Jeff Zhang commented on SPARK-19439: Makes sense, I will work on it. > PySpark's

[jira] [Created] (SPARK-19572) Allow to disable hive in sparkR shell

2017-02-13 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-19572: -- Summary: Allow to disable hive in sparkR shell Key: SPARK-19572 URL: https://issues.apache.org/jira/browse/SPARK-19572 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-19572) Allow to disable hive in sparkR shell

2017-02-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-19572: --- Description: SPARK-15236 does this for scala shell, this ticket is for sparkR shell. This is not only

[jira] [Updated] (SPARK-19570) Allow to disable hive in pyspark shell

2017-02-12 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-19570: --- Description: SPARK-15236 does this for scala shell, this ticket is for pyspark shell. This is not

[jira] [Updated] (SPARK-19570) Allow to disable hive in pyspark shell

2017-02-12 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-19570: --- Description: SPARK-15236 does this for scala shell, this ticket is for pyspark shell. This is not

[jira] [Updated] (SPARK-19570) Allow to disable hive in pyspark shell

2017-02-12 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-19570: --- Description: SPARK-15236 does this for scala shell, this ticket is for pyspark shell. > Allow to

[jira] [Created] (SPARK-19570) Allow to disable hive in pyspark shell

2017-02-12 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-19570: -- Summary: Allow to disable hive in pyspark shell Key: SPARK-19570 URL: https://issues.apache.org/jira/browse/SPARK-19570 Project: Spark Issue Type: Improvement

[jira] [Closed] (SPARK-19096) Kmeans.py application fails with virtualenv and due to parse error

2017-01-05 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang closed SPARK-19096. -- Resolution: Invalid Will do it in SPARK-13587 > Kmeans.py application fails with virtualenv and due

[jira] [Resolved] (SPARK-19095) virtualenv example does not work in yarn cluster mode

2017-01-05 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang resolved SPARK-19095. Resolution: Invalid Will do it in SPARK-13587 > virtualenv example does not work in yarn cluster

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2016-12-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747407#comment-15747407 ] Jeff Zhang commented on SPARK-13587: If it is a pretty large cluster, then I would suggest setting up a

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2016-12-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747338#comment-15747338 ] Jeff Zhang commented on SPARK-13587: [~prasanna.santha...@icloud.com] I don't understand how this can

[jira] [Updated] (SPARK-18786) pySpark SQLContext.getOrCreate(sc) take stopped sparkContext

2016-12-11 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-18786: --- Component/s: PySpark > pySpark SQLContext.getOrCreate(sc) take stopped sparkContext >

[jira] [Comment Edited] (SPARK-18405) Add yarn-cluster mode support to Spark Thrift Server

2016-11-25 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15696978#comment-15696978 ] Jeff Zhang edited comment on SPARK-18405 at 11/26/16 1:01 AM: -- I think he

[jira] [Commented] (SPARK-18405) Add yarn-cluster mode support to Spark Thrift Server

2016-11-25 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15696978#comment-15696978 ] Jeff Zhang commented on SPARK-18405: I think he means to launch multiple spark thrift servers in

[jira] [Updated] (SPARK-18160) spark.files & spark.jars should not be passed to driver in yarn mode

2016-11-01 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-18160: --- Summary: spark.files & spark.jars should not be passed to driver in yarn mode (was: spark.files

[jira] [Updated] (SPARK-18160) spark.files should not passed to driver in yarn-cluster mode

2016-10-31 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-18160: --- Summary: spark.files should not passed to driver in yarn-cluster mode (was: SparkContext.addFile

[jira] [Updated] (SPARK-18160) spark.files should not be passed to driver in yarn-cluster mode

2016-10-31 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-18160: --- Summary: spark.files should not be passed to driver in yarn-cluster mode (was: spark.files should

[jira] [Updated] (SPARK-18160) SparkContext.addFile doesn't work in yarn-cluster mode

2016-10-31 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-18160: --- Description: The following command will fail for spark 2.0 {noformat} bin/spark-submit --class

[jira] [Updated] (SPARK-18160) SparkContext.addFile doesn't work in yarn-cluster mode

2016-10-31 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-18160: --- Description: The following command will fail for spark 2.0 {noformat} bin/spark-submit --class

[jira] [Updated] (SPARK-18160) SparkContext.addFile doesn't work in yarn-cluster mode

2016-10-31 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-18160: --- Description: {noformat} bin/spark-submit --class org.apache.spark.examples.SparkPi --master

[jira] [Created] (SPARK-18160) SparkContext.addFile doesn't work in yarn-cluster mode

2016-10-28 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-18160: -- Summary: SparkContext.addFile doesn't work in yarn-cluster mode Key: SPARK-18160 URL: https://issues.apache.org/jira/browse/SPARK-18160 Project: Spark Issue

[jira] [Updated] (SPARK-16321) [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader

2016-10-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-16321: --- Component/s: (was: PySpark) SQL > [Spark 2.0] Performance regression when

[jira] [Commented] (SPARK-17904) Add a wrapper function to install R packages on each executors.

2016-10-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571366#comment-15571366 ] Jeff Zhang commented on SPARK-17904: It makes sense to provide such an API to install packages, one

[jira] [Created] (SPARK-17605) Add option spark.usePython and spark.useR for applications that use both pyspark and sparkr

2016-09-20 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-17605: -- Summary: Add option spark.usePython and spark.useR for applications that use both pyspark and sparkr Key: SPARK-17605 URL: https://issues.apache.org/jira/browse/SPARK-17605

[jira] [Closed] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-09-19 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang closed SPARK-17054. -- Resolution: Won't Fix Close it as it is resolved somewhere else. > SparkR can not run in

[jira] [Commented] (SPARK-17428) SparkR executors/workers support virtualenv

2016-09-08 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475664#comment-15475664 ] Jeff Zhang commented on SPARK-17428: Found another elegant way to specify version, using devtools

[jira] [Commented] (SPARK-17428) SparkR executors/workers support virtualenv

2016-09-08 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475645#comment-15475645 ] Jeff Zhang commented on SPARK-17428: I just link the jira of python virtualenv. It seems R support

[jira] [Commented] (SPARK-17428) SparkR executors/workers support virtualenv

2016-09-08 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475630#comment-15475630 ] Jeff Zhang commented on SPARK-17428: Source code url needs to be specified for version.

[jira] [Commented] (SPARK-17261) Using HiveContext after re-creating SparkContext in Spark 2.0 throws "Java.lang.illegalStateException: Cannot call methods on a stopped sparkContext"

2016-08-29 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15444981#comment-15444981 ] Jeff Zhang commented on SPARK-17261: It works if you change 'sc._instantiatedContext = None' to

[jira] [Commented] (SPARK-17261) Using HiveContext after re-creating SparkContext in Spark 2.0 throws "Java.lang.illegalStateException: Cannot call methods on a stopped sparkContext"

2016-08-28 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15444694#comment-15444694 ] Jeff Zhang commented on SPARK-17261: [~dongjoon] spark-shell works well for me. It seems your case is

[jira] [Created] (SPARK-17210) sparkr.zip is not distributed to executors when run sparkr in RStudio

2016-08-24 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-17210: -- Summary: sparkr.zip is not distributed to executors when run sparkr in RStudio Key: SPARK-17210 URL: https://issues.apache.org/jira/browse/SPARK-17210 Project: Spark

[jira] [Commented] (SPARK-14501) spark.ml parity for fpm - frequent items

2016-08-23 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432195#comment-15432195 ] Jeff Zhang commented on SPARK-14501: I am not working on it now, as it is a duplicate of SPARK-14503 >

[jira] [Updated] (SPARK-17157) Add multiclass logistic regression SparkR Wrapper

2016-08-22 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-17157: --- Issue Type: Sub-task (was: New Feature) Parent: SPARK-16442 > Add multiclass logistic

[jira] [Created] (SPARK-17178) Allow to set sparkr shell command through --conf

2016-08-21 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-17178: -- Summary: Allow to set sparkr shell command through --conf Key: SPARK-17178 URL: https://issues.apache.org/jira/browse/SPARK-17178 Project: Spark Issue Type:

[jira] [Created] (SPARK-17125) Allow to specify spark config using non-string type in SparkR

2016-08-18 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-17125: -- Summary: Allow to specify spark config using non-string type in SparkR Key: SPARK-17125 URL: https://issues.apache.org/jira/browse/SPARK-17125 Project: Spark

[jira] [Commented] (SPARK-17116) Allow params to be a {string, value} dict at fit time

2016-08-18 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425965#comment-15425965 ] Jeff Zhang commented on SPARK-17116: Is it possible to raise an error if no such param exists

[jira] [Created] (SPARK-17121) Support _HOST replacement for principal

2016-08-17 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-17121: -- Summary: Support _HOST replacement for principal Key: SPARK-17121 URL: https://issues.apache.org/jira/browse/SPARK-17121 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-17103) Can not define class variable in repl

2016-08-17 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-17103: -- Summary: Can not define class variable in repl Key: SPARK-17103 URL: https://issues.apache.org/jira/browse/SPARK-17103 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-16 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423745#comment-15423745 ] Jeff Zhang commented on SPARK-17054: I pushed another commit to disable downloading spark if it is

[jira] [Commented] (SPARK-16578) Configurable hostname for RBackend

2016-08-16 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423741#comment-15423741 ] Jeff Zhang commented on SPARK-16578: Another scenario I'd like to clarify is that. Say we launch R

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422058#comment-15422058 ] Jeff Zhang commented on SPARK-17054: I have a single-node hadoop cluster on my laptop, and I run R

[jira] [Commented] (SPARK-16578) Configurable hostname for RBackend

2016-08-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421878#comment-15421878 ] Jeff Zhang commented on SPARK-16578: I think this feature can also be applied in pyspark. >

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421851#comment-15421851 ] Jeff Zhang commented on SPARK-17054: Here's the command I run. {code} bin/spark-submit --master

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421846#comment-15421846 ] Jeff Zhang commented on SPARK-17054: Do you run it in yarn-cluster mode? > SparkR can not run in

[jira] [Commented] (SPARK-16578) Configurable hostname for RBackend

2016-08-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420749#comment-15420749 ] Jeff Zhang commented on SPARK-16578: I think one purpose of this ticket is to share the same

[jira] [Commented] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420553#comment-15420553 ] Jeff Zhang commented on SPARK-17054: Although I can fix it by using the correct cache dir for mac OS,

[jira] [Created] (SPARK-17054) SparkR can not run in yarn-cluster mode on mac os

2016-08-14 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-17054: -- Summary: SparkR can not run in yarn-cluster mode on mac os Key: SPARK-17054 URL: https://issues.apache.org/jira/browse/SPARK-17054 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16781) java launched by PySpark as gateway may not be the same java used in the spark environment

2016-08-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420547#comment-15420547 ] Jeff Zhang commented on SPARK-16781: JAVA_HOME will be set by yarn, not sure about other cluster

[jira] [Commented] (SPARK-15882) Discuss distributed linear algebra in spark.ml package

2016-08-12 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418585#comment-15418585 ] Jeff Zhang commented on SPARK-15882: I think it is better to keep RDD api underneath as I don't see

[jira] [Updated] (SPARK-16965) Fix bound checking for SparseVector

2016-08-08 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-16965: --- Component/s: PySpark MLlib > Fix bound checking for SparseVector >

[jira] [Created] (SPARK-16965) Fix bound checking for SparseVector

2016-08-08 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-16965: -- Summary: Fix bound checking for SparseVector Key: SPARK-16965 URL: https://issues.apache.org/jira/browse/SPARK-16965 Project: Spark Issue Type: Bug Affects

[jira] [Commented] (SPARK-16890) substring returns wrong result for positive position

2016-08-04 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15407550#comment-15407550 ] Jeff Zhang commented on SPARK-16890: If I remember correctly, it is by design. Because in the sql

[jira] [Commented] (SPARK-16367) Wheelhouse Support for PySpark

2016-07-06 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364972#comment-15364972 ] Jeff Zhang commented on SPARK-16367: [~gae...@xeberon.net] I still don't understand how the binary

[jira] [Commented] (SPARK-16367) Wheelhouse Support for PySpark

2016-07-05 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363509#comment-15363509 ] Jeff Zhang commented on SPARK-16367: Preparing the wheelhouse seems time consuming to me especially

[jira] [Commented] (SPARK-16367) Wheelhouse Support for PySpark

2016-07-05 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363513#comment-15363513 ] Jeff Zhang commented on SPARK-16367: Oh, I happened to find this project to build local python package

[jira] [Commented] (SPARK-16367) Wheelhouse Support for PySpark

2016-07-05 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362764#comment-15362764 ] Jeff Zhang commented on SPARK-16367: [~gae...@xeberon.net] Thanks for the new idea, this makes the

[jira] [Commented] (SPARK-16324) regexp_extract returns empty string when match fails

2016-07-01 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359361#comment-15359361 ] Jeff Zhang commented on SPARK-16324: I think this is by design {code} override def nullSafeEval(s:

[jira] [Commented] (SPARK-16321) Pyspark 2.0 performance drop vs pyspark 1.6

2016-07-01 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359339#comment-15359339 ] Jeff Zhang commented on SPARK-16321: This could be due to a lot of things, maybe reading parquet file,

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2016-06-27 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15351512#comment-15351512 ] Jeff Zhang commented on SPARK-13587: Thanks [~gae...@xeberon.net] Have you taken a look at my PR?

[jira] [Comment Edited] (SPARK-16168) Spark sql can not read ORC table

2016-06-23 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346487#comment-15346487 ] Jeff Zhang edited comment on SPARK-16168 at 6/23/16 2:10 PM: - I don't think

[jira] [Commented] (SPARK-16168) Spark sql can not read ORC table

2016-06-23 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346487#comment-15346487 ] Jeff Zhang commented on SPARK-16168: I don't think it is a spark issue, it is more likely your query

[jira] [Commented] (SPARK-15345) SparkSession's conf doesn't take effect when there's already an existing SparkContext

2016-06-22 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345652#comment-15345652 ] Jeff Zhang commented on SPARK-15345: It has been resolved in

[jira] [Closed] (SPARK-15345) SparkSession's conf doesn't take effect when there's already an existing SparkContext

2016-06-22 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang closed SPARK-15345. -- Resolution: Fixed > SparkSession's conf doesn't take effect when there's already an existing >

[jira] [Issue Comment Deleted] (SPARK-15705) Spark won't read ORC schema from metastore for partitioned tables

2016-06-22 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-15705: --- Comment: was deleted (was: I will take a look at it. ) > Spark won't read ORC schema from metastore

[jira] [Commented] (SPARK-15705) Spark won't read ORC schema from metastore for partitioned tables

2016-06-22 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345533#comment-15345533 ] Jeff Zhang commented on SPARK-15705: I will take a look at it. > Spark won't read ORC schema from

[jira] [Commented] (SPARK-16065) Throw a exception "java.lang.ClassNotFoundException" when run the spark-submit

2016-06-22 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343953#comment-15343953 ] Jeff Zhang commented on SPARK-16065: Would you mind pasting the code around line 22 of test.scala?

[jira] [Commented] (SPARK-16013) Add option to disable HiveContext in spark-shell/pyspark

2016-06-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335747#comment-15335747 ] Jeff Zhang commented on SPARK-16013: Found SPARK-11562, although it is not necessary in spark 2.0, I

[jira] [Commented] (SPARK-16013) Add option to disable HiveContext in spark-shell/pyspark

2016-06-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335727#comment-15335727 ] Jeff Zhang commented on SPARK-16013: I mean to introduce this to 1.6 as in spark 2.0 we can disable

[jira] [Comment Edited] (SPARK-16013) Add option to disable HiveContext in spark-shell/pyspark

2016-06-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335727#comment-15335727 ] Jeff Zhang edited comment on SPARK-16013 at 6/17/16 8:53 AM: - I mean to

[jira] [Created] (SPARK-16013) Add option to disable HiveContext in spark-shell/pyspark

2016-06-17 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-16013: -- Summary: Add option to disable HiveContext in spark-shell/pyspark Key: SPARK-16013 URL: https://issues.apache.org/jira/browse/SPARK-16013 Project: Spark Issue

[jira] [Commented] (SPARK-15993) PySpark RuntimeConfig should be immutable

2016-06-16 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335218#comment-15335218 ] Jeff Zhang commented on SPARK-15993: RuntimeConfig in scala api is mutable, if it doesn't work in

[jira] [Commented] (SPARK-15909) PySpark classpath uri incorrectly set

2016-06-16 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333289#comment-15333289 ] Jeff Zhang commented on SPARK-15909: If I remember correctly, pyspark can only run cluster mode in

[jira] [Commented] (SPARK-15930) Add Row count property to FPGrowth model

2016-06-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329074#comment-15329074 ] Jeff Zhang commented on SPARK-15930: I see, I guess you are trying to get the total number of

[jira] [Commented] (SPARK-15930) Add Row count property to FPGrowth model

2016-06-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329059#comment-15329059 ] Jeff Zhang commented on SPARK-15930: Can't we get the count from freqItemsets in FPGrowthModel?

[jira] [Commented] (SPARK-14503) spark.ml API for FPGrowth

2016-06-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326977#comment-15326977 ] Jeff Zhang commented on SPARK-14503: [~GayathriMurali] [~yuhaoyan] Are you still working on this? If

[jira] [Updated] (SPARK-15819) Add KMeanSummary in KMeans of PySpark

2016-06-10 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-15819: --- Component/s: PySpark ML > Add KMeanSummary in KMeans of PySpark >

[jira] [Closed] (SPARK-15751) Add generateAssociationRules in fpm in pyspark

2016-06-10 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang closed SPARK-15751. -- Resolution: Won't Fix > Add generateAssociationRules in fpm in pyspark >

[jira] [Commented] (SPARK-14501) spark.ml parity for fpm - frequent items

2016-06-10 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15325515#comment-15325515 ] Jeff Zhang commented on SPARK-14501: working on it. > spark.ml parity for fpm - frequent items >

[jira] [Comment Edited] (SPARK-13587) Support virtualenv in PySpark

2016-06-10 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324323#comment-15324323 ] Jeff Zhang edited comment on SPARK-13587 at 6/10/16 12:01 PM: -- Sorry, guys,

[jira] [Comment Edited] (SPARK-13587) Support virtualenv in PySpark

2016-06-10 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324323#comment-15324323 ] Jeff Zhang edited comment on SPARK-13587 at 6/10/16 11:43 AM: -- Sorry, guys,

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2016-06-10 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324323#comment-15324323 ] Jeff Zhang commented on SPARK-13587: Sorry, guys, I have been busy with other stuff recently and am late for

[jira] [Created] (SPARK-15819) Add KMeanSummary in KMeans of PySpark

2016-06-08 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-15819: -- Summary: Add KMeanSummary in KMeans of PySpark Key: SPARK-15819 URL: https://issues.apache.org/jira/browse/SPARK-15819 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-15803) Support with statement syntax for SparkSession

2016-06-07 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-15803: -- Summary: Support with statement syntax for SparkSession Key: SPARK-15803 URL: https://issues.apache.org/jira/browse/SPARK-15803 Project: Spark Issue Type:

[jira] [Commented] (SPARK-3451) spark-submit should support specifying glob wildcards in the --jars CLI option

2016-06-07 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317975#comment-15317975 ] Jeff Zhang commented on SPARK-3451: --- +1 for this feature, or allow specifying jar folder. >

[jira] [Comment Edited] (SPARK-15779) SQL context fails when Hive uses Tez as its default execution engine

2016-06-07 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317959#comment-15317959 ] Jeff Zhang edited comment on SPARK-15779 at 6/7/16 6:27 AM: Actually it is

[jira] [Commented] (SPARK-15779) SQL context fails when Hive uses Tez as its default execution engine

2016-06-07 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317959#comment-15317959 ] Jeff Zhang commented on SPARK-15779: You need to specify hive.execution.engine=mr in your
