[jira] [Commented] (SPARK-1015) Visualize the DAG of RDD

2015-02-26 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339690#comment-14339690 ] Jeff Zhang commented on SPARK-1015: --- [~sowen] I may not have time for this recently.

[jira] [Commented] (SPARK-6653) New configuration property to specify port for sparkYarnAM actor system

2015-06-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589699#comment-14589699 ] Jeff Zhang commented on SPARK-6653: --- Although this is already committed, would it be

[jira] [Commented] (SPARK-4311) ContainerLauncher setting up executor -- invalid Xms settings (-Xms0m -Xmx0m)

2015-08-12 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694619#comment-14694619 ] Jeff Zhang commented on SPARK-4311: --- [~rafa.alfaro] I couldn't reproduce this issue.

[jira] [Commented] (SPARK-2971) Orphaned YARN ApplicationMaster lingers forever

2015-08-12 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694608#comment-14694608 ] Jeff Zhang commented on SPARK-2971: --- Looks like it has been resolved. {code}

[jira] [Comment Edited] (SPARK-2971) Orphaned YARN ApplicationMaster lingers forever

2015-08-12 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694608#comment-14694608 ] Jeff Zhang edited comment on SPARK-2971 at 8/13/15 3:05 AM:

[jira] [Commented] (SPARK-9195) RDD/Storage metrics don't update cached partition counts when executors are removed/lost

2015-08-19 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702777#comment-14702777 ] Jeff Zhang commented on SPARK-9195: --- Find more issues on RDD/Storage UI * Column Size in

[jira] [Comment Edited] (SPARK-9195) RDD/Storage metrics don't update cached partition counts when executors are removed/lost

2015-08-19 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702777#comment-14702777 ] Jeff Zhang edited comment on SPARK-9195 at 8/19/15 9:52 AM:

[jira] [Commented] (SPARK-8167) Tasks that fail due to YARN preemption can cause job failure

2015-07-30 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647313#comment-14647313 ] Jeff Zhang commented on SPARK-8167: --- [~mcheah] What's the status of this ticket ? I

[jira] [Created] (SPARK-11279) Add DataFrame#toDF in PySpark

2015-10-23 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11279: -- Summary: Add DataFrame#toDF in PySpark Key: SPARK-11279 URL: https://issues.apache.org/jira/browse/SPARK-11279 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-11342) Allow to set hadoop profile when running dev/run_tests

2015-10-27 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975906#comment-14975906 ] Jeff Zhang commented on SPARK-11342: Yes, it would be ideal to allow to set any available profile for

[jira] [Commented] (SPARK-11342) Allow to set hadoop profile when running dev/run_tests

2015-10-27 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975888#comment-14975888 ] Jeff Zhang commented on SPARK-11342: [~sowen] Isn't it also for local testing ? {code} if

[jira] [Created] (SPARK-11342) Allow to set hadoop profile when running dev/run_tests

2015-10-27 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11342: -- Summary: Allow to set hadoop profile when running dev/run_tests Key: SPARK-11342 URL: https://issues.apache.org/jira/browse/SPARK-11342 Project: Spark Issue

[jira] [Reopened] (SPARK-11102) Uninformative exception when specifing non-exist input for JSON data source

2015-10-21 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang reopened SPARK-11102: Reopen it > Uninformative exception when specifing non-exist input for JSON data source >

[jira] [Updated] (SPARK-10388) Public dataset loader interface

2015-11-10 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-10388: --- Attachment: SPARK-10388PublicDataSetLoaderInterface.pdf > Public dataset loader interface >

[jira] [Commented] (SPARK-10388) Public dataset loader interface

2015-11-10 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14998759#comment-14998759 ] Jeff Zhang commented on SPARK-10388: [~mengxr] I talked with [~rams] offline, and would love to

[jira] [Created] (SPARK-11622) Make LibSVMRelation extends HadoopFsRelation

2015-11-09 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11622: -- Summary: Make LibSVMRelation extends HadoopFsRelation Key: SPARK-11622 URL: https://issues.apache.org/jira/browse/SPARK-11622 Project: Spark Issue Type:

[jira] [Updated] (SPARK-11622) Make LibSVMRelation extends HadoopFsRelation and Add LibSVMOutputWriter

2015-11-10 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11622: --- Summary: Make LibSVMRelation extends HadoopFsRelation and Add LibSVMOutputWriter (was: Make

[jira] [Created] (SPARK-11691) Allow to specify compression codec in HadoopFsRelation when saving

2015-11-11 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11691: -- Summary: Allow to specify compression codec in HadoopFsRelation when saving Key: SPARK-11691 URL: https://issues.apache.org/jira/browse/SPARK-11691 Project: Spark

[jira] [Commented] (SPARK-11691) Allow to specify compression codec in HadoopFsRelation when saving

2015-11-11 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001801#comment-15001801 ] Jeff Zhang commented on SPARK-11691: Will create a PR soon. > Allow to specify compression codec in

[jira] [Commented] (SPARK-11725) Let UDF to handle null value

2015-11-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005208#comment-15005208 ] Jeff Zhang commented on SPARK-11725: Thanks [~hvanhovell] Should we prevent use primitive in UDF

[jira] [Commented] (SPARK-11725) Let UDF to handle null value

2015-11-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005399#comment-15005399 ] Jeff Zhang commented on SPARK-11725: I am on master > Let UDF to handle null value >

[jira] [Created] (SPARK-11747) Can not specify input path in python logistic_regression example under ml

2015-11-15 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11747: -- Summary: Can not specify input path in python logistic_regression example under ml Key: SPARK-11747 URL: https://issues.apache.org/jira/browse/SPARK-11747 Project: Spark

[jira] [Updated] (SPARK-11747) Can not specify input path in python logistic_regression example under ml

2015-11-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11747: --- Component/s: Examples > Can not specify input path in python logistic_regression example under ml >

[jira] [Updated] (SPARK-11747) Can not specify input path in python logistic_regression example under ml

2015-11-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11747: --- Description: Not sure why it is hard coded, it would be nice to allow user to specify input path

[jira] [Commented] (SPARK-11368) Spark shouldn't scan all partitions when using Python UDF and filter over partitioned column is given

2015-11-09 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997831#comment-14997831 ] Jeff Zhang commented on SPARK-11368: Looks like an issue in QueryPlan optimization step, will work on

[jira] [Commented] (SPARK-6517) Bisecting k-means clustering

2015-11-09 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997711#comment-14997711 ] Jeff Zhang commented on SPARK-6517: --- Yes, I'd love to do the follow up work. will create jira for Python

[jira] [Comment Edited] (SPARK-6517) Bisecting k-means clustering

2015-11-09 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997711#comment-14997711 ] Jeff Zhang edited comment on SPARK-6517 at 11/10/15 12:09 AM: -- Yes, I'd love

[jira] [Commented] (SPARK-6517) Bisecting k-means clustering

2015-11-09 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997728#comment-14997728 ] Jeff Zhang commented on SPARK-6517: --- Oh, got it, thanks for letting me know. > Bisecting k-means

[jira] [Commented] (SPARK-11145) Cannot filter using a partition key and another column

2015-11-05 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991314#comment-14991314 ] Jeff Zhang commented on SPARK-11145: I ran it on the master, seems it has been resolved. > Cannot

[jira] [Updated] (SPARK-11102) Unreadable exception when specifing non-exist input for JSON data source

2015-10-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11102: --- Summary: Unreadable exception when specifing non-exist input for JSON data source (was: Not

[jira] [Commented] (SPARK-10861) Univariate Statistics: Adding range support as UDAF

2015-10-19 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963029#comment-14963029 ] Jeff Zhang commented on SPARK-10861: [~JihongMA] what's your progress on this ? > Univariate

[jira] [Commented] (SPARK-11125) Unreadable exception when running spark-sql without building with -Phive-thriftserver and SPARK_PREPEND_CLASSES is set

2015-10-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958397#comment-14958397 ] Jeff Zhang commented on SPARK-11125: Will create pull request soon. > Unreadable exception when

[jira] [Created] (SPARK-11125) Unreadable exception when running spark-sql without building with -Phive-thriftserver and SPARK_PREPEND_CLASSES is set

2015-10-15 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11125: -- Summary: Unreadable exception when running spark-sql without building with -Phive-thriftserver and SPARK_PREPEND_CLASSES is set Key: SPARK-11125 URL:

[jira] [Updated] (SPARK-11099) Default conf property file is not loaded

2015-10-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11099: --- Description: spark.driver.extraClassPath doesn't take effect in the latest code, and find the root

[jira] [Created] (SPARK-11099) Default conf property file is not loaded

2015-10-14 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11099: -- Summary: Default conf property file is not loaded Key: SPARK-11099 URL: https://issues.apache.org/jira/browse/SPARK-11099 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-11099) Default conf property file is not loaded

2015-10-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11099: --- Component/s: Spark Submit > Default conf property file is not loaded >

[jira] [Commented] (SPARK-11099) Default conf property file is not loaded

2015-10-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956399#comment-14956399 ] Jeff Zhang commented on SPARK-11099: Will create a pull request soon > Default conf property file is

[jira] [Updated] (SPARK-11099) Default conf property file is not loaded

2015-10-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11099: --- Component/s: Spark Shell > Default conf property file is not loaded >

[jira] [Updated] (SPARK-11102) Not readable exception when specifing non-exist input for JSON data source

2015-10-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11102: --- Issue Type: Improvement (was: Bug) > Not readable exception when specifing non-exist input for JSON

[jira] [Created] (SPARK-11102) Not readable exception when specifing non-exist input for JSON data source

2015-10-14 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11102: -- Summary: Not readable exception when specifing non-exist input for JSON data source Key: SPARK-11102 URL: https://issues.apache.org/jira/browse/SPARK-11102 Project:

[jira] [Updated] (SPARK-11102) Not readable exception when specifing non-exist input for JSON data source

2015-10-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11102: --- Priority: Minor (was: Major) > Not readable exception when specifing non-exist input for JSON data

[jira] [Commented] (SPARK-11102) Not readable exception when specifing non-exist input for JSON data source

2015-10-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956491#comment-14956491 ] Jeff Zhang commented on SPARK-11102: Will create a pull request soon > Not readable exception when

[jira] [Updated] (SPARK-11099) Default conf property file is not loaded

2015-10-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11099: --- Affects Version/s: 1.5.1 > Default conf property file is not loaded >

[jira] [Updated] (SPARK-11102) Uninformative exception when specifing non-exist input for JSON data source

2015-10-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11102: --- Summary: Uninformative exception when specifing non-exist input for JSON data source (was:

[jira] [Commented] (SPARK-11205) Delegate to scala DataFrame API rather than print in python

2015-10-20 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964821#comment-14964821 ] Jeff Zhang commented on SPARK-11205: Will create PR soon. > Delegate to scala DataFrame API rather

[jira] [Created] (SPARK-11204) Delegate to scala DataFrame API rather than print in python

2015-10-20 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11204: -- Summary: Delegate to scala DataFrame API rather than print in python Key: SPARK-11204 URL: https://issues.apache.org/jira/browse/SPARK-11204 Project: Spark

[jira] [Updated] (SPARK-11205) Delegate to scala DataFrame API rather than print in python

2015-10-20 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11205: --- Description: When I use DataFrame#explain(), I found the output is a little different from scala

[jira] [Created] (SPARK-11205) Delegate to scala DataFrame API rather than print in python

2015-10-20 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11205: -- Summary: Delegate to scala DataFrame API rather than print in python Key: SPARK-11205 URL: https://issues.apache.org/jira/browse/SPARK-11205 Project: Spark

[jira] [Commented] (SPARK-2654) Leveled logging in PySpark

2015-10-20 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964586#comment-14964586 ] Jeff Zhang commented on SPARK-2654: --- [~davies] I think currently spark-core also don't have logging

[jira] [Commented] (SPARK-6517) Bisecting k-means clustering

2015-10-19 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964550#comment-14964550 ] Jeff Zhang commented on SPARK-6517: --- Is the work still going on ? If not, I'd like to help continue the

[jira] [Commented] (SPARK-9299) percentile and percentile_approx aggregate functions

2015-10-20 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964656#comment-14964656 ] Jeff Zhang commented on SPARK-9299: --- Link with SPARK-6761 as they can share the same algorithm >

[jira] [Updated] (SPARK-11205) Match the output of DataFrame#explain() in both scala api and python

2015-10-21 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11205: --- Summary: Match the output of DataFrame#explain() in both scala api and python (was: Delegate to

[jira] [Created] (SPARK-11226) Empty line in json file should be skipped

2015-10-21 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11226: -- Summary: Empty line in json file should be skipped Key: SPARK-11226 URL: https://issues.apache.org/jira/browse/SPARK-11226 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-11002) pyspark doesn't support UDAF

2015-10-20 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang resolved SPARK-11002. Resolution: Duplicate Duplicate of SPARK-10915 > pyspark doesn't support UDAF >

[jira] [Updated] (SPARK-11798) Datanucleus jars is missing under lib_managed/jars

2015-11-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11798: --- Description: I notice the comments in https://github.com/apache/spark/pull/9575 said that

[jira] [Created] (SPARK-11798) Datanucleus jars is missing under lib_managed/jars

2015-11-17 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11798: -- Summary: Datanucleus jars is missing under lib_managed/jars Key: SPARK-11798 URL: https://issues.apache.org/jira/browse/SPARK-11798 Project: Spark Issue Type:

[jira] [Commented] (SPARK-11804) Exception raise when using Jdbc predicates option in PySpark

2015-11-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15010140#comment-15010140 ] Jeff Zhang commented on SPARK-11804: It is a bug in PySpark, working on it. > Exception raise when

[jira] [Updated] (SPARK-11804) Exception raise when using Jdbc predicates option in PySpark

2015-11-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11804: --- Priority: Minor (was: Major) > Exception raise when using Jdbc predicates option in PySpark >

[jira] [Created] (SPARK-11804) Exception raise when using Jdbc predicates option in PySpark

2015-11-17 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11804: -- Summary: Exception raise when using Jdbc predicates option in PySpark Key: SPARK-11804 URL: https://issues.apache.org/jira/browse/SPARK-11804 Project: Spark

[jira] [Updated] (SPARK-11804) Exception raise when using Jdbc predicates option in PySpark

2015-11-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11804: --- Priority: Major (was: Minor) > Exception raise when using Jdbc predicates option in PySpark >

[jira] [Created] (SPARK-11725) Let UDF to handle null value

2015-11-13 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11725: -- Summary: Let UDF to handle null value Key: SPARK-11725 URL: https://issues.apache.org/jira/browse/SPARK-11725 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-11725) Let UDF to handle null value

2015-11-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15003836#comment-15003836 ] Jeff Zhang commented on SPARK-11725: And I found that PySpark will allow the UDF to handle null

[jira] [Commented] (SPARK-11725) Let UDF to handle null value

2015-11-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15003950#comment-15003950 ] Jeff Zhang commented on SPARK-11725: bq. So there is no way to express null; in these case scala will

[jira] [Commented] (SPARK-11775) Allow PySpark to register Java UDF

2015-11-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15008364#comment-15008364 ] Jeff Zhang commented on SPARK-11775: Working on it. > Allow PySpark to register Java UDF >

[jira] [Created] (SPARK-11775) Allow PySpark to register Java UDF

2015-11-17 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-11775: -- Summary: Allow PySpark to register Java UDF Key: SPARK-11775 URL: https://issues.apache.org/jira/browse/SPARK-11775 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-5185) pyspark --jars does not add classes to driver class path

2015-11-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15008360#comment-15008360 ] Jeff Zhang commented on SPARK-5185: --- I think pyspark --jars do put classes to driver class path. But the

[jira] [Updated] (SPARK-10481) SPARK_PREPEND_CLASSES make spark-yarn related jar could not be found

2015-09-07 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-10481: --- Description: It happens when SPARK_PREPEND_CLASSES is set and run spark on yarn. If

[jira] [Created] (SPARK-10481) SPARK_PREPEND_CLASSES make spark-yarn related jar could not be found

2015-09-07 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-10481: -- Summary: SPARK_PREPEND_CLASSES make spark-yarn related jar could not be found Key: SPARK-10481 URL: https://issues.apache.org/jira/browse/SPARK-10481 Project: Spark

[jira] [Commented] (SPARK-10481) SPARK_PREPEND_CLASSES make spark-yarn related jar could not be found

2015-09-07 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14734134#comment-14734134 ] Jeff Zhang commented on SPARK-10481: Working on it (Try to throw a more readable exception) >

[jira] [Updated] (SPARK-10481) SPARK_PREPEND_CLASSES make spark-yarn related jar could not be found

2015-09-07 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-10481: --- Description: It happens when SPARK_PREPEND_CLASSES is set and run spark on yarn. If

[jira] [Created] (SPARK-10526) Display cores/memory on ExecutorsTab

2015-09-09 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-10526: -- Summary: Display cores/memory on ExecutorsTab Key: SPARK-10526 URL: https://issues.apache.org/jira/browse/SPARK-10526 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-10530) Kill other task attempts when one taskattempt belonging the same task is succeeded in speculation

2015-09-09 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-10530: -- Summary: Kill other task attempts when one taskattempt belonging the same task is succeeded in speculation Key: SPARK-10530 URL: https://issues.apache.org/jira/browse/SPARK-10530

[jira] [Commented] (SPARK-9790) [YARN] Expose in WebUI if NodeManager is the reason why executors were killed.

2015-09-09 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738187#comment-14738187 ] Jeff Zhang commented on SPARK-9790: --- Pretty useful feature IMO, any progress on it ? > [YARN] Expose in

[jira] [Created] (SPARK-10531) AppId is set as AppName in status rest api

2015-09-10 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-10531: -- Summary: AppId is set as AppName in status rest api Key: SPARK-10531 URL: https://issues.apache.org/jira/browse/SPARK-10531 Project: Spark Issue Type:

[jira] [Closed] (SPARK-12092) StringIndexer failing with Unseen label exception on test data

2015-12-02 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang closed SPARK-12092. -- Resolution: Won't Fix > StringIndexer failing with Unseen label exception on test data >

[jira] [Commented] (SPARK-11940) Python API for ml.clustering.LDA

2015-12-02 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037391#comment-15037391 ] Jeff Zhang commented on SPARK-11940: Looks like there's even no scala api under ml for LDA. Will

[jira] [Commented] (SPARK-11940) Python API for ml.clustering.LDA

2015-12-02 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037413#comment-15037413 ] Jeff Zhang commented on SPARK-11940: Thanks [~yanboliang] didn't rebase my repository :) > Python

[jira] [Issue Comment Deleted] (SPARK-11940) Python API for ml.clustering.LDA

2015-12-02 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-11940: --- Comment: was deleted (was: Thanks [~yanboliang] didn't rebase my repository :)) > Python API for

[jira] [Commented] (SPARK-11940) Python API for ml.clustering.LDA

2015-12-02 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037414#comment-15037414 ] Jeff Zhang commented on SPARK-11940: Thanks [~yanboliang] didn't rebase my repository :) > Python

[jira] [Created] (SPARK-12119) Support compression in PySpark

2015-12-03 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-12119: -- Summary: Support compression in PySpark Key: SPARK-12119 URL: https://issues.apache.org/jira/browse/SPARK-12119 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-12120) Improve exception message when failing to initialize HiveContext in PySpark

2015-12-03 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-12120: -- Summary: Improve exception message when failing to initialize HiveContext in PySpark Key: SPARK-12120 URL: https://issues.apache.org/jira/browse/SPARK-12120 Project:

[jira] [Updated] (SPARK-12120) Improve exception message when failing to initialize HiveContext in PySpark

2015-12-03 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-12120: --- Description: I get the following exception message when failing to initialize HiveContext. This is

[jira] [Created] (SPARK-12166) Unset hadoop related environment in testing

2015-12-06 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-12166: -- Summary: Unset hadoop related environment in testing Key: SPARK-12166 URL: https://issues.apache.org/jira/browse/SPARK-12166 Project: Spark Issue Type:

[jira] [Updated] (SPARK-12166) Unset hadoop related environment in testing

2015-12-06 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-12166: --- Priority: Minor (was: Major) > Unset hadoop related environment in testing >

[jira] [Created] (SPARK-12086) Support multiple input paths for LibSVMRelation

2015-12-01 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-12086: -- Summary: Support multiple input paths for LibSVMRelation Key: SPARK-12086 URL: https://issues.apache.org/jira/browse/SPARK-12086 Project: Spark Issue Type:

[jira] [Commented] (SPARK-12092) StringIndexer failing with Unseen label exception on test data

2015-12-01 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035433#comment-15035433 ] Jeff Zhang commented on SPARK-12092: Looks like it is been resolved in SPARK-8764 > StringIndexer

[jira] [Commented] (SPARK-12045) Use joda's DateTime to replace Calendar

2015-12-01 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035333#comment-15035333 ] Jeff Zhang commented on SPARK-12045: [~cloud_fan] I saw you are on the history of DateTimeUtils, so

[jira] [Commented] (SPARK-12045) Use joda's DateTime to replace Calendar

2015-12-07 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046056#comment-15046056 ] Jeff Zhang commented on SPARK-12045: bq. Our general policy for exceptions is that we return null for

[jira] [Commented] (SPARK-4591) Add algorithm/model wrappers in spark.ml to adapt the new API

2015-12-09 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049974#comment-15049974 ] Jeff Zhang commented on SPARK-4591: --- Should this be closed ? Seems many algorithms have been ported, and

[jira] [Commented] (SPARK-12180) DataFrame.join() in PySpark gives misleading exception when column name exists on both side

2015-12-16 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061320#comment-15061320 ] Jeff Zhang commented on SPARK-12180: Simulate your sample code, but it works for me. But I am on

[jira] [Commented] (SPARK-12384) Allow -Xms to be set differently then -Xmx

2015-12-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061835#comment-15061835 ] Jeff Zhang commented on SPARK-12384: Correct, the memory is controlled by cluster manager, set

[jira] [Commented] (SPARK-4497) HiveThriftServer2 does not exit properly on failure

2015-12-14 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055924#comment-15055924 ] Jeff Zhang commented on SPARK-4497: --- Can not reproduce it. [~yanakad] Is this still an issue for you ?

[jira] [Created] (SPARK-12334) Support read from multiple input paths for orc file in DataFrameReader.orc

2015-12-15 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-12334: -- Summary: Support read from multiple input paths for orc file in DataFrameReader.orc Key: SPARK-12334 URL: https://issues.apache.org/jira/browse/SPARK-12334 Project:

[jira] [Updated] (SPARK-12334) Support read from multiple input paths for orc file in DataFrameReader.orc

2015-12-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-12334: --- Component/s: PySpark > Support read from multiple input paths for orc file in DataFrameReader.orc >

[jira] [Updated] (SPARK-12334) Support read from multiple input paths for orc file in DataFrameReader.orc

2015-12-15 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-12334: --- Affects Version/s: 1.6.0 Target Version/s: 1.6.1 > Support read from multiple input paths for

[jira] [Commented] (SPARK-12180) DataFrame.join() in PySpark gives misleading exception when column name exists on both side

2015-12-13 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055429#comment-15055429 ] Jeff Zhang commented on SPARK-12180: Could you paste your code ? It works fine for me to join 2

[jira] [Commented] (SPARK-12384) Allow -Xms to be set differently then -Xmx

2015-12-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062069#comment-15062069 ] Jeff Zhang commented on SPARK-12384: OK, get it. You mean the driver side. > Allow -Xms to be set

[jira] [Commented] (SPARK-12420) Have a built-in CSV data source implementation

2015-12-18 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15063736#comment-15063736 ] Jeff Zhang commented on SPARK-12420: +1, this is very common use data format. Not sure why it is not

[jira] [Comment Edited] (SPARK-12420) Have a built-in CSV data source implementation

2015-12-18 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15063736#comment-15063736 ] Jeff Zhang edited comment on SPARK-12420 at 12/18/15 9:15 AM: -- +1, this is

[jira] [Created] (SPARK-12318) Save mode in SparkR should be error by default

2015-12-14 Thread Jeff Zhang (JIRA)
Jeff Zhang created SPARK-12318: -- Summary: Save mode in SparkR should be error by default Key: SPARK-12318 URL: https://issues.apache.org/jira/browse/SPARK-12318 Project: Spark Issue Type: Bug
