[jira] [Commented] (SPARK-3374) Spark on Yarn remove deprecated configs for 2.0

2016-01-20 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110053#comment-15110053 ] Felix Cheung commented on SPARK-3374: - +1 esp on config handling. SPARK-12343 gets messy with places

Spark Talks: Using MLib to Predict Popular Tweets & Using Zeppelin Notebooks

2016-01-19 Thread Felix Cheung
FYI _ *Note, expedite your check in at Galvanize and register here Talk 1: Using Spark MLlib To Predict Most Popular Tweets Spark's Machine Learning Library (MLlib) enables running Machine Learning algorithms in a scalable way on massive

Re: SparkR with Hive integration

2016-01-19 Thread Felix Cheung
You might need hive-site.xml _ From: Peter Zhang Sent: Monday, January 18, 2016 9:08 PM Subject: Re: SparkR with Hive integration To: Jeff Zhang Cc: Thanks,  I will try.

[jira] [Commented] (SPARK-12790) Remove HistoryServer old multiple files format

2016-01-19 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108138#comment-15108138 ] Felix Cheung commented on SPARK-12790: -- I have made the changes but running tests, I can't get

[jira] [Commented] (SPARK-12790) Remove HistoryServer old multiple files format

2016-01-19 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108140#comment-15108140 ] Felix Cheung commented on SPARK-12790: -- FsHistoryProviderSuite, which I did change, is passing

Re: SparkContext SyntaxError: invalid syntax

2016-01-19 Thread Felix Cheung
roperty for setting environment variables. On Sun, Jan 17, 2016 at 11:37 PM, Felix Cheung <felixcheun...@hotmail.com> wrote: > Do you still need help on the PR? > btw, does this apply to YARN client mode? > > -- > From: andrewweiner2...@u.northweste

RE: SparkContext SyntaxError: invalid syntax

2016-01-17 Thread Felix Cheung
Do you still need help on the PR? btw, does this apply to YARN client mode? From: andrewweiner2...@u.northwestern.edu Date: Sun, 17 Jan 2016 17:00:39 -0600 Subject: Re: SparkContext SyntaxError: invalid syntax To: cutl...@gmail.com CC: user@spark.apache.org Yeah, I do think it would be worth

[jira] [Commented] (SPARK-12343) Remove YARN Client / ClientArguments

2016-01-17 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103915#comment-15103915 ] Felix Cheung commented on SPARK-12343: -- Should deploy/Client be separately removed too? https

[jira] [Commented] (SPARK-12790) Remove HistoryServer old multiple files format

2016-01-17 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103856#comment-15103856 ] Felix Cheung commented on SPARK-12790: -- is this simply removing the code path

RE: Are we running SparkR tests in Jenkins?

2016-01-17 Thread Felix Cheung
I think that breaks sparkR, the commandline script, and Jenkins, in which run-test.sh is calling sparkR. I'll work on this - since this also affects my PR #10652... Date: Fri, 15 Jan 2016 15:33:13 -0800 Subject: Re: Are we running SparkR tests in Jenkins? From: zjf...@gmail.com To:

[jira] [Created] (SPARK-12862) Jenkins does not run R tests

2016-01-17 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12862: Summary: Jenkins does not run R tests Key: SPARK-12862 URL: https://issues.apache.org/jira/browse/SPARK-12862 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-12862) Jenkins does not run R tests

2016-01-17 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103633#comment-15103633 ] Felix Cheung commented on SPARK-12862: -- mail thread: http://apache-spark-developers-list.1001551.n3

Re: [VOTE] Release Apache Zeppelin (incubating) 0.5.6-incubating (RC2)

2016-01-14 Thread Felix Cheung
Best, Jungtaek Lim (HeartSaVioR) On Thu, Jan 14, 2016 at 2:53 PM, Corneau Damien <cornead...@gmail.com> wrote: > +1 (binding) > > On Thu, Jan 14, 2016 at 2:38 PM, moon soo Lee <m...@apache.org> wrote: > > > +1 (binding)

[jira] [Created] (ZEPPELIN-602) elasticsearch throws ArrayIndexOutOfBoundsException for interpreting an empty paragraph

2016-01-12 Thread Felix Cheung (JIRA)
Felix Cheung created ZEPPELIN-602: - Summary: elasticsearch throws ArrayIndexOutOfBoundsException for interpreting an empty paragraph Key: ZEPPELIN-602 URL: https://issues.apache.org/jira/browse/ZEPPELIN-602

[jira] [Created] (ZEPPELIN-601) import note, add from url textbox is not clearing buttons from previous choices

2016-01-12 Thread Felix Cheung (JIRA)
Felix Cheung created ZEPPELIN-601: - Summary: import note, add from url textbox is not clearing buttons from previous choices Key: ZEPPELIN-601 URL: https://issues.apache.org/jira/browse/ZEPPELIN-601

RE: [VOTE] Release Apache Zeppelin (incubating) 0.5.6-incubating (RC2)

2016-01-12 Thread Felix Cheung
+1 Tested: packaged spark, external spark; Spark/Scala, pyspark, md, sh, flink, various UI features. Found and filed these bugs, I don't think they are blockers: https://issues.apache.org/jira/browse/ZEPPELIN-599 notebook search should search paragraph

[jira] [Created] (ZEPPELIN-600) notebook search should have a way to clear search and return to the previous page

2016-01-12 Thread Felix Cheung (JIRA)
Felix Cheung created ZEPPELIN-600: - Summary: notebook search should have a way to clear search and return to the previous page Key: ZEPPELIN-600 URL: https://issues.apache.org/jira/browse/ZEPPELIN-600

[jira] [Created] (ZEPPELIN-599) notebook search should search paragraph title

2016-01-12 Thread Felix Cheung (JIRA)
Felix Cheung created ZEPPELIN-599: - Summary: notebook search should search paragraph title Key: ZEPPELIN-599 URL: https://issues.apache.org/jira/browse/ZEPPELIN-599 Project: Zeppelin Issue

RE: sparkR ORC support.

2016-01-12 Thread Felix Cheung
c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sc <<- sparkR.init()
sc <<- sparkRHive.init()
hivecontext <<- sparkRHive.init(sc)
df <- loadDF(hivecontext, "/data/ingest/sparktest1/", "orc")
#View(df)

Re: sparkR ORC support.

2016-01-12 Thread Felix Cheung
would need to call the line hivecontext <- sparkRHive.init(sc) again. _ From: Sandeep Khurana <sand...@infoworks.io> Sent: Tuesday, January 12, 2016 5:20 AM Subject: Re: sparkR ORC support. To: Felix Cheung <felixcheun...@hotmail.com> Cc: spark users

[jira] [Updated] (SPARK-5162) Python yarn-cluster mode

2016-01-07 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-5162: Description: 2Running pyspark in yarn is currently limited to ‘yarn-client’ mode. It would

[jira] [Created] (SPARK-12699) R driver process should start in a clean state

2016-01-07 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12699: Summary: R driver process should start in a clean state Key: SPARK-12699 URL: https://issues.apache.org/jira/browse/SPARK-12699 Project: Spark Issue Type

[jira] [Updated] (SPARK-5162) Python yarn-cluster mode

2016-01-07 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-5162: Description: Running pyspark in yarn is currently limited to ‘yarn-client’ mode. It would be great

Re: sparkR ORC support.

2016-01-06 Thread Felix Cheung
pi.r.SQLUtils", "loadDF", sqlContext, > source, options) > 2 > read.df(sqlContext, filepath, "orc") at > spark_api.R#108 > > On Wed, Jan 6, 2016 at 10:30 AM, Felix Cheung <felixcheun...@hotmail.com> > wrote: > >> Firstly I don't have ORC data t

[jira] [Commented] (SPARK-12609) Make R to JVM timeout configurable

2016-01-05 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15083506#comment-15083506 ] Felix Cheung commented on SPARK-12609: -- It looks like the timeout to socketConnection is merely

[jira] [Commented] (SPARK-11139) Make SparkContext.stop() exception-safe

2016-01-05 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085172#comment-15085172 ] Felix Cheung commented on SPARK-11139: -- it looks like this might have been resolved by https

Re: pyspark Dataframe and histogram through ggplot (python)

2016-01-05 Thread Felix Cheung
Hi, select() returns a new Spark DataFrame; I would imagine ggplot would not work with it. Could you try df.select("something").toPandas()? _ From: Snehotosh Banerjee Sent: Tuesday, January 5, 2016 4:32 AM Subject: pyspark Dataframe

Re: sparkR ORC support.

2016-01-05 Thread Felix Cheung
Firstly I don't have ORC data to verify but this should work: df <- loadDF(sqlContext, "data/path", "orc") Secondly, could you check if sparkR.stop() was called? sparkRHive.init() should be called after sparkR.init() - please check if there is any error message there.

[jira] [Commented] (SPARK-12609) Make R to JVM timeout configurable

2016-01-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081686#comment-15081686 ] Felix Cheung commented on SPARK-12609: -- Shouldn't we have a "in-progress" update ping or

[jira] [Commented] (SPARK-12625) SparkR is using deprecated APIs

2016-01-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081903#comment-15081903 ] Felix Cheung commented on SPARK-12625: -- Is this the complete set of API to be removed? I took a look

Re: how to use sparkR or spark MLlib load csv file on hdfs then calculate covariance

2015-12-28 Thread Felix Cheung
Make sure you add the csv Spark package as in this example, so that the source parameter in R read.df works: https://spark.apache.org/docs/latest/sparkr.html#from-data-sources _ From: Andy Davidson Sent: Monday, December 28,

[jira] [Created] (SPARK-12534) Document missing command line options to Spark properties mapping

2015-12-27 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12534: Summary: Document missing command line options to Spark properties mapping Key: SPARK-12534 URL: https://issues.apache.org/jira/browse/SPARK-12534 Project: Spark

Re: number of executors in sparkR.init()

2015-12-25 Thread Felix Cheung
The equivalent for spark-submit --num-executors should be spark.executor.instances when used in SparkConf; see http://spark.apache.org/docs/latest/running-on-yarn.html Could you try setting that with sparkR.init()? _ From: Franc Carter Sent:
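As a rough illustration of the flag-to-property correspondence discussed in this thread, here is a minimal Python sketch. It only covers a handful of spark-submit flags, and the helper name flags_to_properties is hypothetical (it is not part of Spark); it simply rewrites command-line style arguments into the property names one would set in a SparkConf.

```python
# Illustrative only: a few spark-submit flags and the Spark properties they
# correspond to. The table is deliberately incomplete, and the helper name
# flags_to_properties is hypothetical (it is not part of Spark).
FLAG_TO_PROPERTY = {
    "--num-executors": "spark.executor.instances",
    "--executor-memory": "spark.executor.memory",
    "--executor-cores": "spark.executor.cores",
    "--driver-memory": "spark.driver.memory",
}


def flags_to_properties(args):
    """Turn ['--num-executors', '4', ...] into a dict keyed by Spark property name."""
    props = {}
    i = 0
    while i < len(args):
        prop = FLAG_TO_PROPERTY.get(args[i])
        if prop is not None and i + 1 < len(args):
            props[prop] = args[i + 1]  # the flag's value is the next token
            i += 2
        else:
            i += 1  # unknown flag or missing value: skip this token
    return props


if __name__ == "__main__":
    print(flags_to_properties(["--num-executors", "4", "--executor-memory", "2G"]))
```

In SparkR 1.x, properties like these are usually handed to sparkR.init() through its sparkEnvir list argument; that is worth checking against the SparkR documentation for the version in use.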

Re: Do existing R packages work with SparkR data frames

2015-12-23 Thread Felix Cheung
Hi, SparkR has some support for machine learning algorithms like glm. For existing R packages, currently you would need to collect to convert into an R data.frame, assuming it fits into the memory of the driver node, though that would be required to work with an R package in any case.

[jira] [Created] (SPARK-12515) Minor clarification on DataFrameReader.jdbc doc

2015-12-23 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12515: Summary: Minor clarification on DataFrameReader.jdbc doc Key: SPARK-12515 URL: https://issues.apache.org/jira/browse/SPARK-12515 Project: Spark Issue Type

[jira] [Commented] (SPARK-12327) lint-r checks fail with commented code

2015-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067575#comment-15067575 ] Felix Cheung commented on SPARK-12327: -- https://github.com/apache/spark/pull/10408 > lint-r che

[jira] [Commented] (SPARK-12360) Support using 64-bit long type in SparkR

2015-12-18 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15065207#comment-15065207 ] Felix Cheung commented on SPARK-12360: -- +1 string. How would parse error (eg. should be int

[jira] [Commented] (SPARK-12327) lint-r checks fail with commented code

2015-12-15 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059510#comment-15059510 ] Felix Cheung commented on SPARK-12327: -- re: PR to lintr: https://github.com/jimhester/lintr/issues

[jira] [Commented] (SPARK-11255) R Test build should run on R 3.1.2

2015-12-15 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059515#comment-15059515 ] Felix Cheung commented on SPARK-11255: -- [~joshrosen] was there a specific change in R 3.1.2 that you

[jira] [Commented] (SPARK-10312) Enhance SerDe to handle atomic vector

2015-12-15 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059537#comment-15059537 ] Felix Cheung commented on SPARK-10312: -- I think method implementation could easily handle this. ie

[jira] [Commented] (SPARK-12172) Consider removing SparkR internal RDD APIs

2015-12-15 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059550#comment-15059550 ] Felix Cheung commented on SPARK-12172: -- That's a great point. [~shivaram] should https

[jira] [Comment Edited] (SPARK-12327) lint-r checks fail with commented code

2015-12-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056995#comment-15056995 ] Felix Cheung edited comment on SPARK-12327 at 12/14/15 11:51 PM: - "#

[jira] [Commented] (SPARK-12327) lint-r checks fail with commented code

2015-12-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056995#comment-15056995 ] Felix Cheung commented on SPARK-12327: -- "# void -> NULL" is another example - they

[jira] [Commented] (SPARK-12327) lint-r checks fail with commented code

2015-12-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056994#comment-15056994 ] Felix Cheung commented on SPARK-12327: -- yea, I think these are overactive checks in lint-r that I

[jira] [Commented] (SPARK-11255) R Test build should run on R 3.1.1

2015-12-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057368#comment-15057368 ] Felix Cheung commented on SPARK-11255: -- Is this test error caused by this? https

[jira] [Commented] (SPARK-12327) lint-r checks fail with commented code

2015-12-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057364#comment-15057364 ] Felix Cheung commented on SPARK-12327: -- In fact [~yu_ishikawa] did open a PR for a subset

[jira] [Updated] (SPARK-12232) Create new R API for read.table to avoid conflict

2015-12-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12232: - Summary: Create new R API for read.table to avoid conflict (was: Consider exporting read.table

[jira] [Commented] (SPARK-12232) Consider exporting read.table in R

2015-12-11 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054003#comment-15054003 ] Felix Cheung commented on SPARK-12232: -- agreed, `sqlTableToDF` would make sense. > Consi

[jira] [Commented] (SPARK-12235) Enhance mutate() to support replace existing columns

2015-12-09 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049217#comment-15049217 ] Felix Cheung commented on SPARK-12235: -- Please see https://github.com/apache/spark/pull/8503

[jira] [Commented] (SPARK-12232) Consider exporting read.table in R

2015-12-09 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049743#comment-15049743 ] Felix Cheung commented on SPARK-12232: -- table() is not on DataFrame, it's with SQLContext which

[jira] [Commented] (SPARK-12071) Programming guide should explain NULL in JVM translate to NA in R

2015-12-09 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15050104#comment-15050104 ] Felix Cheung commented on SPARK-12071: -- Release note: http://spark.apache.org/releases/spark-release

Re: SparkR read.df failed to read file from local directory

2015-12-08 Thread Felix Cheung
Have you tried flightsDF <- read.df(sqlContext, "/home/myuser/test_data/sparkR/flights.csv", source = "com.databricks.spark.csv", header = "true")     _ From: Boyu Zhang Sent: Tuesday, December 8, 2015 8:47 AM Subject: SparkR read.df

[jira] [Commented] (SPARK-12224) R support for JDBC source

2015-12-08 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15047868#comment-15047868 ] Felix Cheung commented on SPARK-12224: -- I'm working on this. > R support for JDBC sou

[jira] [Created] (SPARK-12224) R support for JDBC source

2015-12-08 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12224: Summary: R support for JDBC source Key: SPARK-12224 URL: https://issues.apache.org/jira/browse/SPARK-12224 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-12232) Consider exporting read.table in R

2015-12-08 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12232: - Summary: Consider exporting read.table in R (was: Consider exporting in R read.table

[jira] [Comment Edited] (SPARK-12232) Consider exporting read.table in R

2015-12-08 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048039#comment-15048039 ] Felix Cheung edited comment on SPARK-12232 at 12/9/15 5:22 AM: --- WIP here

[jira] [Commented] (SPARK-12232) Consider exporting read.table in R

2015-12-08 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048039#comment-15048039 ] Felix Cheung commented on SPARK-12232: -- WIP here: https://github.com/felixcheung/spark/commit

[jira] [Commented] (SPARK-12232) Consider exporting read.table in R

2015-12-08 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048195#comment-15048195 ] Felix Cheung commented on SPARK-12232: -- right, but then table() is confusing as well. R's notion

[jira] [Created] (SPARK-12232) Consider exporting in R read.table

2015-12-08 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12232: Summary: Consider exporting in R read.table Key: SPARK-12232 URL: https://issues.apache.org/jira/browse/SPARK-12232 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-12235) Enhance mutate() to support replace existing columns

2015-12-08 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048204#comment-15048204 ] Felix Cheung commented on SPARK-12235: -- Is this related to https://issues.apache.org/jira/browse

[jira] [Updated] (SPARK-12172) Consider removing SparkR internal RDD APIs

2015-12-07 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12172: - Summary: Consider removing SparkR internal RDD APIs (was: Remove SparkR internal RDD APIs

[jira] [Commented] (SPARK-12172) Consider removing SparkR internal RDD APIs

2015-12-07 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046369#comment-15046369 ] Felix Cheung commented on SPARK-12172: -- Sure. What are the uses for RDD stuff you have in mind

[jira] [Created] (SPARK-12168) Need test for masked function

2015-12-06 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12168: Summary: Need test for masked function Key: SPARK-12168 URL: https://issues.apache.org/jira/browse/SPARK-12168 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-12172) Remove SparkR internal RDD APIs

2015-12-06 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12172: Summary: Remove SparkR internal RDD APIs Key: SPARK-12172 URL: https://issues.apache.org/jira/browse/SPARK-12172 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-12173) Consider supporting DataSet API in SparkR

2015-12-06 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12173: Summary: Consider supporting DataSet API in SparkR Key: SPARK-12173 URL: https://issues.apache.org/jira/browse/SPARK-12173 Project: Spark Issue Type: Sub

[jira] [Commented] (SPARK-12169) SparkR 2.0

2015-12-06 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15044390#comment-15044390 ] Felix Cheung commented on SPARK-12169: -- Great thanks for opening this. I think we should definitely

[jira] [Updated] (SPARK-12168) Need test for conflicted function in R

2015-12-06 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12168: - Summary: Need test for conflicted function in R (was: Need test for masked function) > N

[jira] [Updated] (SPARK-12168) Need test for masked function

2015-12-06 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12168: - Description: Currently it is hard to know if a function in base or stats packages are masked

[jira] [Commented] (SPARK-12169) SparkR 2.0

2015-12-06 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15044394#comment-15044394 ] Felix Cheung commented on SPARK-12169: -- For those who's reading - we shouldn't open PR yet. See

RE: Spark worker memory not freed up after zeppelin run finishes

2015-12-04 Thread Felix Cheung
@zeppelin.incubator.apache.org That is understandable, but what about if you stop execution by pressing button in notebook? If you do that after you cached some rdd or broadcasted a variable, the cleanup code won't be executed, right ? On Thu, Dec 3, 2015 at 6:25 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:

[jira] [Commented] (SPARK-12144) Implement DataFrameReader and DataFrameWriter API in SparkR

2015-12-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15042103#comment-15042103 ] Felix Cheung commented on SPARK-12144: -- +1 [~shivaram] The style {code} read.format("json"

[jira] [Updated] (SPARK-12148) SparkR: rename DataFrame to SparkDataFrame

2015-12-04 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12148: - Component/s: SparkR > SparkR: rename DataFrame to SparkDataFr

Re: Where to deploy Zeppelin for Cloudera cluster

2015-12-04 Thread Felix Cheung
You should be able to setup a client only machine and assign spark and hive clients on that. On Fri, Dec 4, 2015 at 1:15 PM -0800, "Hoc Phan" wrote: When I setup Cloudera, there is no /hive dir in management node. I guess I had to add that role in Cloudera Manager

Re: zeppelin job is running all the time

2015-12-03 Thread Felix Cheung
the time until I exit from hive cml. No matter how many HQL I submit, it is still single job there. I think it is because of using Tez. Is Tez having conflict with zeppelin? On Dec 3, 2015, at 2:15 AM, Felix Cheung < felixcheun...

Re: Spark worker memory not freed up after zeppelin run finishes

2015-12-03 Thread Felix Cheung
variables in the end, am I correct? Because it hasn't > crashed since then, the following runs are always a little slower though. > > On Thu, Dec 3, 2015 at 8:08 AM, Felix Cheung <felixcheun...@hotmail.com> > wrote: > >> How are you

Re: Spark worker memory not freed up after zeppelin run finishes

2015-12-03 Thread Felix Cheung
So far it seems it stopped after I started destroying them + cachedRdd.unpersist On Thu, Dec 3, 2015 at 5:52 PM, Felix Cheung <felixcheun...@hotmail.com> wrote: > Do you know what version of spark you are running with? > > > > > > On Thu, Dec 3, 2015 at 12:52 AM -0800, "

Re: Python API Documentation Mismatch

2015-12-03 Thread Felix Cheung
Please open an issue in JIRA, thanks! On Thu, Dec 3, 2015 at 3:03 AM -0800, "Roberto Pagliari" wrote: Hello, I believe there is a mismatch between the API documentation (1.5.2) and the software currently available. Not all functions mentioned here

[jira] [Updated] (SPARK-12019) SparkR.init does not support character vector for sparkJars and sparkPackages

2015-12-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12019: - Affects Version/s: 1.4.0 1.4.1 > SparkR.init does not support charac

Re: zeppelin job is running all the time

2015-12-03 Thread Felix Cheung
lt 1000 zeppelin.spark.useHiveContext true On Dec 3, 2015, at 11:51 AM, Felix Cheung <felixcheun...@hotmail.com> wrote: Could you send us your configuration of the Spark Inter

Re: SparkR in Spark 1.5.2 jsonFile Bug Found

2015-12-03 Thread Felix Cheung
It looks like this has been broken around Spark 1.5. Please see JIRA SPARK-10185. This has been fixed in pyspark but unfortunately SparkR was missed. I have confirmed this is still broken in Spark 1.6. Could you please open a JIRA? On Thu, Dec 3, 2015 at 2:08 PM -0800, "tomasr3"

RE: Spark worker memory not freed up after zeppelin run finishes

2015-12-02 Thread Felix Cheung
How are you running jobs? Do you schedule a notebook to run from Zeppelin? Date: Mon, 30 Nov 2015 12:42:16 +0100 Subject: Spark worker memory not freed up after zeppelin run finishes From: liska.ja...@gmail.com To: users@zeppelin.incubator.apache.org Hey, I'm connecting Zeppelin with a remote

[jira] [Updated] (SPARK-12019) SparkR.init does not support character vector for sparkJars and sparkPackages

2015-12-02 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12019: - Summary: SparkR.init does not support character vector for sparkJars and sparkPackages

[jira] [Commented] (SPARK-12019) SparkR.init does not support character vector for sparkJars and sparkPackages

2015-12-02 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037221#comment-15037221 ] Felix Cheung commented on SPARK-12019: -- per PR comment, we are going to update the code to support

RE: zeppelin job is running all the time

2015-12-02 Thread Felix Cheung
I don't know enough about HDP, but there should be a way to check user queue in YARN? Spark job shouldn't affect Hive job though. Have you tried running spark-shell (--master yarn-client) and a Hive job at the same time? From: will...@gmail.com Subject: Re: zeppelin job is running all the time

[jira] [Commented] (SPARK-12071) Programming guide should explain NULL in JVM translate to NA in R

2015-12-02 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037249#comment-15037249 ] Felix Cheung commented on SPARK-12071: -- Also see discussion in https://github.com/apache/spark/pull

[jira] [Updated] (SPARK-12071) Programming guide should explain NULL in JVM translate to NA in R

2015-12-02 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12071: - Labels: releasenotes (was: ) > Programming guide should explain NULL in JVM translate to

[jira] [Commented] (SPARK-11886) R function name conflicts with base or stats package ones

2015-12-02 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037251#comment-15037251 ] Felix Cheung commented on SPARK-11886: -- Created SPARK-12116 for dplyr conflicts > R function n

[jira] [Created] (SPARK-12116) Document workaround when method conflicts with another R package, like dplyr

2015-12-02 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12116: Summary: Document workaround when method conflicts with another R package, like dplyr Key: SPARK-12116 URL: https://issues.apache.org/jira/browse/SPARK-12116 Project

[jira] [Comment Edited] (SPARK-11886) R function name conflicts with base or stats package ones

2015-12-02 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037251#comment-15037251 ] Felix Cheung edited comment on SPARK-11886 at 12/3/15 4:51 AM: --- Created

[jira] [Updated] (SPARK-12116) Document workaround when method conflicts with another R package, like dplyr

2015-12-02 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12116: - Description: See https://issues.apache.org/jira/browse/SPARK-11886 > Document workaround w

[jira] [Commented] (SPARK-11886) R function name conflicts with base or stats package ones

2015-12-01 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034700#comment-15034700 ] Felix Cheung commented on SPARK-11886: -- {code} > library(dplyr) > library(SparkR, lib.loc

[jira] [Commented] (SPARK-10894) Add 'drop' support for DataFrame's subset function

2015-12-01 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035090#comment-15035090 ] Felix Cheung commented on SPARK-10894: -- These seem like orthogonal things. I don't know

[jira] [Commented] (SPARK-11886) R function name conflicts with base or stats package ones

2015-11-30 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032810#comment-15032810 ] Felix Cheung commented on SPARK-11886: -- I see this if I load dplyr after SparkR {code} > libr

[jira] [Created] (SPARK-12071) Programming guide should explain NULL in JVM translate to NA in R

2015-11-30 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-12071: Summary: Programming guide should explain NULL in JVM translate to NA in R Key: SPARK-12071 URL: https://issues.apache.org/jira/browse/SPARK-12071 Project: Spark

[jira] [Commented] (SPARK-12071) Programming guide should explain NULL in JVM translate to NA in R

2015-11-30 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033227#comment-15033227 ] Felix Cheung commented on SPARK-12071: -- See PR https://github.com/apache/spark/commit

[jira] [Comment Edited] (SPARK-12071) Programming guide should explain NULL in JVM translate to NA in R

2015-11-30 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033227#comment-15033227 ] Felix Cheung edited comment on SPARK-12071 at 12/1/15 7:07 AM: --- See commit

RE: possible bug spark/python/pyspark/rdd.py portable_hash()

2015-11-29 Thread Felix Cheung
--executor-memory 2G \ $extraPkgs \ $* From: Felix Cheung <felixcheun...@hotmail.com> Date: Saturday, November 28, 2015 at 12:11 AM To: Ted Yu <yuzhih...@gmail.com> Cc: Andrew Davidson <a...@santacruzintegration.com>, "user @spark" <user@spark.apache.org>

[jira] [Updated] (SPARK-12019) SparkR.init have wrong example

2015-11-29 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-12019: - Component/s: SparkR > SparkR.init have wrong exam

Re: possible bug spark/python/pyspark/rdd.py portable_hash()

2015-11-28 Thread Felix Cheung
Ah, it's there in spark-submit and pyspark. Seems like it should be added for spark_ec2 _ From: Ted Yu <yuzhih...@gmail.com> Sent: Friday, November 27, 2015 11:50 AM Subject: Re: possible bug spark/python/pyspark/rdd.py portable_hash() To: Felix

RE: [DISCUSS]Strict code style and PR guide

2015-11-27 Thread Felix Cheung
+1, +1 also on discussing Coding guideline / Coding style on http://flink.apache.org/contribute-code.html - think we should adopt it as applicable to this community. > From: m...@apache.org > Date: Fri, 27 Nov 2015 08:23:53 + > Subject: Re: [DISCUSS]Strict code style and PR guide > To:
