Re: more uniform exception handling?

2016-04-18 Thread Evan Chan
+1000. Especially if the UI can help correlate exceptions, and we can reduce some exceptions. There are some exceptions which are in practice very common, such as the nasty ClassNotFoundException, that most folks end up spending tons of time debugging. On Mon, Apr 18, 2016 at 12:16 PM, Reynold

Re: Using local-cluster mode for testing Spark-related projects

2016-04-17 Thread Evan Chan
ith a `local-cluster` mode by > yourself like > 'https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/ShuffleSuite.scala#L55'? > > // maropu > > On Sun, Apr 17, 2016 at 9:47 AM, Evan Chan <velvia.git...@gmail.com> wrote: >> >> Hey folks

Using local-cluster mode for testing Spark-related projects

2016-04-16 Thread Evan Chan
Hey folks, I'd like to use local-cluster mode in my Spark-related projects to test Spark functionality in an automated way in a simulated local cluster. The idea is to test multi-process things in a much easier fashion than setting up a real cluster. However, getting this up and running in a

Re: Creating Spark Extras project, was Re: SPARK-13843 and future of streaming backends

2016-04-16 Thread Evan Chan
Hi folks, Sorry to join the discussion late. I had a look at the design doc earlier in this thread, and it was not mentioned what types of projects are the targets of this new "spark extras" ASF umbrella. Is the desire to have a maintained set of spark-related projects that keep pace with

Spark Summit CFP - Tracks guidelines

2015-02-04 Thread Evan Chan
Hey guys, Is there any guidance on what the different tracks for Spark Summit West mean? There are some new ones, like Third Party Apps, which seems like it would be similar to the Use Cases. Any further guidance would be great. thanks, Evan

Re: Welcoming three new committers

2015-02-03 Thread Evan Chan
Congrats everyone!!! On Tue, Feb 3, 2015 at 3:17 PM, Timothy Chen tnac...@gmail.com wrote: Congrats all! Tim On Feb 4, 2015, at 7:10 AM, Pritish Nawlakhe prit...@nirvana-international.com wrote: Congrats and welcome back!! Thank you!! Regards Pritish Nirvana International Inc.

Re: SparkSubmit.scala and stderr

2015-02-03 Thread Evan Chan
Why not just use SLF4J? On Tue, Feb 3, 2015 at 2:22 PM, Reynold Xin r...@databricks.com wrote: We can use ScalaTest's PrivateMethodTester also instead of exposing that. On Tue, Feb 3, 2015 at 2:18 PM, Marcelo Vanzin van...@cloudera.com wrote: Hi Jay, On Tue, Feb 3, 2015 at 6:28 AM,

Re: renaming SchemaRDD - DataFrame

2015-02-01 Thread Evan Chan
can i find that? thanks On Thu, Jan 29, 2015 at 2:32 PM, Evan Chan velvia.git...@gmail.com wrote: +1 having proper NA support is much cleaner than using null, at least the Java null. On Wed, Jan 28, 2015 at 6:10 PM, Evan R. Sparks evan.spa...@gmail.com wrote: You've got

Re: renaming SchemaRDD - DataFrame

2015-01-29 Thread Evan Chan
. See, e.g. http://www.r-bloggers.com/r-na-vs-null/ On Wed, Jan 28, 2015 at 4:42 PM, Reynold Xin r...@databricks.com wrote: Isn't that just null in SQL? On Wed, Jan 28, 2015 at 4:41 PM, Evan Chan velvia.git...@gmail.com wrote: I believe that most DataFrame implementations out there, like

Re: renaming SchemaRDD - DataFrame

2015-01-28 Thread Evan Chan
, sql.catalyst is hidden from users, and all public APIs have first class classes/objects defined in sql directly. On Wed, Jan 28, 2015 at 4:20 PM, Evan Chan velvia.git...@gmail.com wrote: Hey guys, How does this impact the data sources API? I was planning on using this for a project. +1

Re: renaming SchemaRDD - DataFrame

2015-01-28 Thread Evan Chan
Hey guys, How does this impact the data sources API? I was planning on using this for a project. +1 that many things from spark-sql / DataFrame are universally desirable and useful. By the way, one thing that prevents the columnar compression stuff in Spark SQL from being more useful is, at

Re: renaming SchemaRDD - DataFrame

2015-01-28 Thread Evan Chan
: Isn't that just null in SQL? On Wed, Jan 28, 2015 at 4:41 PM, Evan Chan velvia.git...@gmail.com wrote: I believe that most DataFrame implementations out there, like Pandas, support the idea of missing values / NA, and some support the idea of Not Meaningful as well. Does Row support anything

Re: Multitenancy in Spark - within/across spark context

2014-10-23 Thread Evan Chan
Ashwin, I would say the strategies in general are: 1) Have each user submit a separate Spark app (each its own Spark Context), with its own resource settings, and share data through HDFS or something like Tachyon for speed. 2) Share a single spark context amongst multiple users, using fair

Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-08 Thread Evan Chan
James, Michael at the meetup last night said there was some development activity around ORCFiles. I'm curious though, what are the pros and cons of ORCFiles vs Parquet? On Wed, Oct 8, 2014 at 10:03 AM, James Yu jym2...@gmail.com wrote: Didn't see anyone ask the question before, but I was

Re: [Spark SQL] off-heap columnar store

2014-09-02 Thread Evan Chan
/Vertica/etc. and write back into Parquet, but this would take a long time and incur huge I/O overhead. I'm sorry, it just sounds like it's worth clearly defining what your key requirement/goal is. On Thu, Aug 28, 2014 at 11:31 PM, Evan Chan velvia.git...@gmail.com wrote: The reason I'm

Re: [Spark SQL] off-heap columnar store

2014-08-29 Thread Evan Chan
The reason I'm asking about the columnar compressed format is that there are some problems for which Parquet is not practical. Can you elaborate? Sure. - Organization or co has no Hadoop, but significant investment in some other NoSQL store. - Need to efficiently add a new column to

Re: [Spark SQL] off-heap columnar store

2014-08-26 Thread Evan Chan
What would be the timeline for the parquet caching work? The reason I'm asking about the columnar compressed format is that there are some problems for which Parquet is not practical. On Mon, Aug 25, 2014 at 1:13 PM, Michael Armbrust mich...@databricks.com wrote: What is the plan for getting

[Spark SQL] off-heap columnar store

2014-08-22 Thread Evan Chan
Hey guys, What is the plan for getting Tachyon/off-heap support for the columnar compressed store? It's not in 1.1 is it? In particular: - being able to set TACHYON as the caching mode - loading of hot columns or all columns - write-through of columnar store data to HDFS or backing store -

Re: sbt-package-bin

2014-04-02 Thread Evan Chan
into a single lib/ folder, so in some ways it's even easier to manage than the assembly. You might also check out the sbt-native-packager (https://github.com/sbt/sbt-native-packager). Cheers, Lee -- -- Evan Chan Staff Engineer e...@ooyala.com | http://www.ooyala.com/ http://www.facebook.com

Would anyone mind having a quick look at PR#288?

2014-04-02 Thread Evan Chan
https://github.com/apache/spark/pull/288 It's for fixing SPARK-1154, which would help Spark be a better citizen for most deploys, and should be really small and easy to review. thanks, Evan -- -- Evan Chan Staff Engineer e...@ooyala.com | http://www.ooyala.com/ http://www.facebook.com

sbt-package-bin

2014-04-01 Thread Evan Chan
/ folder, so in some ways it's even easier to manage than the assembly. Also I'm not sure if there's an equivalent plugin for Maven. thanks, Evan -- -- Evan Chan Staff Engineer e...@ooyala.com | http://www.ooyala.com/ http://www.facebook.com/ooyala http://www.linkedin.com/company/ooyala http

Re: sbt-package-bin

2014-04-01 Thread Evan Chan
Also, I understand this is the last week / merge window for 1.0, so if folks are interested I'd like to get in a PR quickly. thanks, Evan On Tue, Apr 1, 2014 at 11:24 AM, Evan Chan e...@ooyala.com wrote: Hey folks, We are in the middle of creating a Chef recipe for Spark. As part

Re: sbt-package-bin

2014-04-01 Thread Evan Chan
... On Tue, Apr 1, 2014 at 11:24 AM, Evan Chan e...@ooyala.com wrote: Also, I understand this is the last week / merge window for 1.0, so if folks are interested I'd like to get in a PR quickly. thanks, Evan On Tue, Apr 1, 2014 at 11:24 AM, Evan Chan e...@ooyala.com wrote

Re: [DISCUSS] Shepherding PRs

2014-03-27 Thread Evan Chan
Niklas as a committer in collaboration with Ian. Hope that makes sense and don't hesitate to tell me that this was not the right way to achieve shepherding. cheers! Till -- -- Evan Chan Staff Engineer e...@ooyala.com | http://www.ooyala.com/ http://www.facebook.com/ooyala http

Re: Announcing the official Spark Job Server repo

2014-03-24 Thread Evan Chan
this point to deploy using marathon (should be planned for April) greetz and again, Nice Work Evan! Ndi On Wed, Mar 19, 2014 at 7:27 AM, Evan Chan e...@ooyala.com wrote: Andy, Yeah, we've thought of deploying this on Marathon ourselves, but we're not sure how much Mesos we're going to use yet

Re: Spark 0.9.1 release

2014-03-24 Thread Evan Chan
if there are fixes that were not backported but you would like to see them in 0.9.1. Thanks! TD -- -- Evan Chan Staff Engineer e...@ooyala.com |

Re: new Catalyst/SQL component merged into master

2014-03-24 Thread Evan Chan
the new backend for Shark. -- -- Evan Chan Staff Engineer e...@ooyala.com |

Re: Spark 0.9.1 release

2014-03-24 Thread Evan Chan
the PR and we can merge it into branch-0.9. If we have to cut another release, then we can include it. On Sun, Mar 23, 2014 at 11:42 PM, Evan Chan e...@ooyala.com wrote: I also have a really minor fix for SPARK-1057 (upgrading fastutil), could that also make it in? -Evan On Sun, Mar 23

Re: spark jobserver

2014-03-24 Thread Evan Chan
spark-contrib. On Sat, Mar 22, 2014 at 6:15 PM, Suhas Satish suhas.sat...@gmail.com wrote: Any plans of integrating SPARK-818 into spark trunk? The pull request is open. It offers spark as a service with spark jobserver running as a separate process. Thanks, Suhas. -- -- Evan Chan Staff

Re: Spark 0.9.1 release

2014-03-24 Thread Evan Chan
dependency graph... -- -- Evan Chan Staff Engineer e...@ooyala.com |

Re: spark jobserver

2014-03-24 Thread Evan Chan
Suhas, You're welcome. We are planning to speak about the job server at the Spark Summit by the way. -Evan On Mon, Mar 24, 2014 at 9:38 AM, Suhas Satish suhas.sat...@gmail.com wrote: Thanks a lot for this update Evan, really appreciate the effort. On Monday, March 24, 2014, Evan Chan e

Re: Announcing the official Spark Job Server repo

2014-03-19 Thread Evan Chan
On Tue, Mar 18, 2014 at 11:39 PM, Henry Saputra henry.sapu...@gmail.com wrote: W00t! Thanks for releasing this, Evan. - Henry On Tue, Mar 18, 2014 at 1:51 PM, Evan Chan e...@ooyala.com wrote: Dear Spark developers, Ooyala is happy to announce that we have pushed our official, Spark

Re: Announcing the official Spark Job Server repo

2014-03-19 Thread Evan Chan
/Powered+By+Spark. Matei On Mar 18, 2014, at 1:51 PM, Evan Chan e...@ooyala.com wrote: Dear Spark developers, Ooyala is happy to announce that we have pushed our official, Spark 0.9.0 / Scala 2.10-compatible, job server as a github repo: https://github.com/ooyala/spark-jobserver Complete

Re: Announcing the official Spark Job Server repo

2014-03-19 Thread Evan Chan
repo set-up for the 1.0 release. On Tue, Mar 18, 2014 at 11:28 PM, Evan Chan e...@ooyala.com wrote: Matei, Maybe it's time to explore the spark-contrib idea again? Should I start a JIRA ticket? -Evan On Tue, Mar 18, 2014 at 4:04 PM, Matei Zaharia matei.zaha...@gmail.com wrote

Re: repositories for spark jars

2014-03-19 Thread Evan Chan
x 238 Email: nkronenf...@oculusinfo.com -- -- Evan Chan Staff Engineer e...@ooyala.com |

Announcing the official Spark Job Server repo

2014-03-18 Thread Evan Chan
is now closed. Please have a look; pull requests are very welcome. -- -- Evan Chan Staff Engineer e...@ooyala.com |

Spark 0.9.0 and log4j

2014-03-07 Thread Evan Chan
-- -- Evan Chan Staff Engineer e...@ooyala.com |

Spark JIRA

2014-02-28 Thread Evan Chan
Hey guys, There is no plan to move the Spark JIRA from the current https://spark-project.atlassian.net/ right? -- -- Evan Chan Staff Engineer e...@ooyala.com |

Re: Spark JIRA

2014-02-28 Thread Evan Chan
Best, -- Nan Zhu On Friday, February 28, 2014 at 2:29 PM, Evan Chan wrote: Hey guys, There is no plan to move the Spark JIRA from the current https://spark-project.atlassian.net/ right? -- -- Evan Chan Staff Engineer e...@ooyala.com (mailto:e...@ooyala.com

New JIRA ticket: cleaning up app-* folders

2014-02-28 Thread Evan Chan
a cron job to clean up old folders. thanks, -Evan -- -- Evan Chan Staff Engineer e...@ooyala.com |

New blog post on Spark + Parquet + Scrooge

2014-02-28 Thread Evan Chan
-unsubscr...@spark.apache.org But if you send an email to the digest-subscribe it bounces back with a help email. -- -- Evan Chan Staff Engineer e...@ooyala.com |

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Evan Chan
not completely obvious to me how to proceed with what sbt-pom-reader produces in order to build the assemblies, run the test suites, etc., so I'm wondering if you have already worked out what that requires? On Wed, Feb 26, 2014 at 9:31 AM, Evan Chan e...@ooyala.com wrote: I'd like to propose the following

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Evan Chan
challenging.) On Wed, Feb 26, 2014 at 11:34 AM, Evan Chan e...@ooyala.com wrote: Mark, No, I haven't tried this myself yet :-p Also I would expect that sbt-pom-reader does not do assemblies at all because that is an SBT plugin, so we would still need code to include sbt-assembly

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-25 Thread Evan Chan
to the *spark jar* using this plug-in (e.g. not an uber jar)? That's something I could see being really handy in the future. - Patrick On Tue, Feb 25, 2014 at 3:39 PM, Evan Chan e...@ooyala.com wrote: The problem is that plugins are not equivalent. There is AFAIK no equivalent to the maven shader