Re: welcoming Burak and Holden as committers

2017-01-24 Thread Chester Chen
Congratulations to both. Holden, we need to catch up. Chester Chen ■ Senior Manager – Data Science & Engineering 3000 Clearview Way San Mateo, CA 94402 From: Felix Cheung <felixcheun...@hotmail.com> Date: Tuesday, January 24, 2017 at 1:20 PM To: R

Re: [discuss] DataFrame vs Dataset in Spark 2.0

2016-02-25 Thread Chester Chen
I vote for Option 1. 1) Since 2.0 is a major release, we are expecting some API changes. 2) It helps long-term code base maintenance, with short-term pain on the Java side. 3) Not quite sure how large the code base using the Java DataFrame APIs is. On Thu, Feb 25, 2016 at 3:23 PM, Reynold Xin

Re: Dropping support for earlier Hadoop versions in Spark 2.0?

2015-11-20 Thread Chester Chen
For #1-3, the answer is likely no. We recently upgraded to Spark 1.5.1, with CDH5.3, CDH5.4, HDP2.2 and others. We were using the CDH5.3 client to talk to CDH5.4. We were doing this to see if we can support many different Hadoop cluster versions without changing the build. This was ok for

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Chester Chen
+1. Tested against CDH5.4.2 with Hadoop 2.6.0 using yesterday's code, built locally. Ran regressions in YARN cluster mode against a few internal ML jobs (logistic regression, linear regression, random forest and statistical summary) as well as MLlib KMeans. All seems to work fine. Chester On Tue,

Re: Possible bug on Spark Yarn Client (1.5.1) during kerberos mode ?

2015-10-22 Thread Chester Chen
e/SPARK-9042
>
> By changing how Hive Context instance is created, this issue might also be
> resolved.
>
> On Thu, Oct 22, 2015 at 11:33 AM Steve Loughran <ste...@hortonworks.com>
> wrote:
>
>> On 22 Oct 2015, at 08:25, Chester Chen <ches...@alpinenow.com> wrote:
>>

Re: Possible bug on Spark Yarn Client (1.5.1) during kerberos mode ?

2015-10-22 Thread Chester Chen
Thanks for the ticket. Chester On Thu, Oct 22, 2015 at 1:15 PM, Steve Loughran <ste...@hortonworks.com> wrote: > > On 22 Oct 2015, at 19:32, Chester Chen <ches...@alpinenow.com> wrote: > > Steven > You summarized mostly correct. But there is a couple p

Possible bug on Spark Yarn Client (1.5.1) during kerberos mode ?

2015-10-21 Thread Chester Chen
All, just checking whether this happens to others as well. This was tested against Spark 1.5.1 (branch 1.5 with label 1.5.2-SNAPSHOT with commit on Tue Oct 6, 84f510c4fa06e43bd35e2dc8e1008d0590cbe266). Spark deployment mode: Spark-Cluster. Notice that if we enable Kerberos mode,

Re: Possible bug on Spark Yarn Client (1.5.1) during kerberos mode ?

2015-10-21 Thread Chester Chen
hadoop cluster). The job submission actually failed on the client side. Currently we get around this by replacing Spark's hive-exec with Apache hive-exec. Chester On Wed, Oct 21, 2015 at 5:27 PM, Doug Balog <d...@balog.net> wrote: > See comments below. > > > On Oct 21,

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-09-01 Thread Chester Chen
.1 snapshot) ?
> >>>
> >>>
> >>> Sent from my iPad
> >>>
> >>>> On Sep 1, 2015, at 1:52 AM, Sean Owen <so...@cloudera.com> wrote:
> >>>>
> >>>> That's correct for the 1.5 branch, right?

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-31 Thread Chester Chen
It seems that the GitHub branch-1.5 has already changed the version to 1.5.1-SNAPSHOT. I am a bit confused: are we still on 1.5.0 RC3, or are we on 1.5.1? Chester On Mon, Aug 31, 2015 at 3:52 PM, Reynold Xin wrote: > I'm going to -1 the release myself since the issue @yhuai

Re: High Availability of Spark Driver

2015-08-28 Thread Chester Chen
Ashish and Steve, I am also working on a long-running YARN Spark job and have just started to focus on failure recovery. This thread of discussion is really helpful. Chester On Fri, Aug 28, 2015 at 12:53 AM, Ashish Rawat ashish.ra...@guavus.com wrote: Thanks Steve. I had not spent many brain

Re: Welcoming some new committers

2015-06-17 Thread Chester Chen
Congratulations to all. DB and Sandy, great work! On Wed, Jun 17, 2015 at 3:12 PM, Matei Zaharia matei.zaha...@gmail.com wrote: Hey all, Over the past 1.5 months we added a number of new committers to the project, and I wanted to welcome them now that all of their respective forms,

Re: Change for submitting to yarn in 1.3.1

2015-05-25 Thread Chester Chen
I put the design requirements and description in the commit comment, so I will close the PR. Please refer to the following commit: https://github.com/AlpineNow/spark/commit/5b336bbfe92eabca7f4c20e5d49e51bb3721da4d On Mon, May 25, 2015 at 3:21 PM, Chester Chen ches...@alpinenow.com wrote: All

Re: Change for submitting to yarn in 1.3.1

2015-05-25 Thread Chester Chen
All, I have created a PR just for the purpose of helping document the use case, requirements and design. As it is unlikely to get merged, it is only used to illustrate the problems we are trying to solve and the approaches we took. https://github.com/apache/spark/pull/6398 Hope this

Re: Submit Kill Spark Application program programmatically from another application

2015-05-03 Thread Chester Chen
Sounds like you are in Yarn-Cluster mode. I created a JIRA, SPARK-3913 https://issues.apache.org/jira/browse/SPARK-3913, and a PR, https://github.com/apache/spark/pull/2786. Is this what you are looking for? Chester On Sat, May 2, 2015 at 10:32 PM, Yijie Shen henry.yijies...@gmail.com wrote: Hi,

Question regarding some of the changes in [SPARK-3477]

2015-04-14 Thread Chester Chen
While working on upgrading to Spark 1.3.x, I noticed that the Client and ClientArguments classes in the yarn module are now defined as private[spark]. I know that this code is mostly used by spark-submit, but we call the YARN client directly (without going through spark-submit) in our spark

Re: broadcast hang out

2015-03-15 Thread Chester Chen
Can you just replace Duration.Inf with a shorter duration? How about
    import scala.concurrent.duration._
    val timeout = new Timeout(10 seconds)
    Await.result(result.future, timeout.duration)
or
    val timeout = new FiniteDuration(10, TimeUnit.SECONDS)
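The suggestion above, bounding the wait instead of using Duration.Inf, can be sketched with the Scala standard library alone. This is a minimal illustration, not the code from the thread: `awaitOrTimeout` and the never-completed promise are hypothetical stand-ins for the hanging broadcast. The `Timeout` class in the snippet appears to be Akka's `akka.util.Timeout`, and `TimeUnit` is `java.util.concurrent.TimeUnit`; neither is needed here.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

object AwaitWithTimeout {
  // Hypothetical helper (not from the thread): wait at most `limit` for `f`,
  // turning a timeout into a Left instead of blocking forever.
  def awaitOrTimeout[A](f: Future[A], limit: FiniteDuration): Either[String, A] =
    try Right(Await.result(f, limit))
    catch { case _: TimeoutException => Left(s"no result within $limit") }

  def main(args: Array[String]): Unit = {
    // A promise that is never completed stands in for the hanging call.
    val stuck = Promise[String]()
    println(awaitOrTimeout(stuck.future, 200.millis))

    // A future that is already complete returns without waiting.
    println(awaitOrTimeout(Future.successful("done"), 200.millis))
  }
}
```

With a finite limit, a stuck future surfaces as a TimeoutException the caller can handle, rather than a thread that never returns. A failed future still rethrows its own exception from Await.result.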

FYI: Prof John Canny is giving a talk on Machine Learning at the limit in SF Big Analytics Meetup

2015-02-10 Thread Chester Chen
Just in case you are in San Francisco, we are having a meetup by Prof John Canny http://www.meetup.com/SF-Big-Analytics/events/220427049/ Chester

Re: Unit testing Master-Worker Message Passing

2014-10-15 Thread Chester Chen
You can call ActorSelection.resolveOne() to see whether the actor is still there and the path is correct. The method returns a future, and you can wait for it with a timeout. This way, you know whether the actor is alive, already dead, or the path is incorrect. Another way is to send an Identify message to

Re: RFC: Deprecating YARN-alpha API's

2014-09-09 Thread Chester Chen
We were using it until recently. We are talking to our customers to see if we can get off it. Chester Alpine Data Labs On Tue, Sep 9, 2014 at 10:59 AM, Sean Owen so...@cloudera.com wrote: FWIW consensus from Cloudera folk seems to be that there's no need or demand on this end for YARN

is Branch-1.1 SBT build broken for yarn-alpha ?

2014-08-20 Thread Chester Chen
I just updated today's build and tried branch-1.1 for both yarn and yarn-alpha. For the yarn build, this command seems to work fine:
    sbt/sbt -Pyarn -Dhadoop.version=2.3.0-cdh5.0.1 projects
For yarn-alpha:
    sbt/sbt -Pyarn-alpha -Dhadoop.version=2.0.5-alpha projects
I got the following Any ideas

Re: is Branch-1.1 SBT build broken for yarn-alpha ?

2014-08-20 Thread Chester Chen
Just tried the master branch, and it works fine for yarn-alpha. On Wed, Aug 20, 2014 at 4:39 PM, Chester Chen ches...@alpinenow.com wrote: I just updated today's build and tried branch-1.1 for both yarn and yarn-alpha. For yarn build, this command seem to work fine. sbt/sbt

Re: Master compilation with sbt

2014-07-19 Thread Chester Chen
Works for me as well:
    git branch
      branch-0.9
      branch-1.0
    * master
    Chesters-MacBook-Pro:spark chester$ git pull --rebase
    remote: Counting objects: 578, done.
    remote: Compressing objects: 100% (369/369), done.
    remote: Total 578 (delta 122), reused 418 (delta 71)
    Receiving objects: 100%

Re: Possible bug in ClientBase.scala?

2014-07-17 Thread Chester Chen
only see compile errors in yarn-stable, and you are trying to compile vs YARN alpha versions no? On Thu, Jul 17, 2014 at 5:39 AM, Chester Chen ches...@alpinenow.com wrote: Looking further, the yarn and yarn-stable are both for the stable version of Yarn, that explains the compilation

Re: Possible bug in ClientBase.scala?

2014-07-17 Thread Chester Chen
explicitly. In fact I think you can just call to ClientBase for this? PR it, I say. On Thu, Jul 17, 2014 at 3:24 PM, Chester Chen ches...@alpinenow.com wrote: val knownDefMRAppCP: Seq[String] = getFieldValue[String, Seq[String]](classOf[MRJobConfig

Re: Possible bug in ClientBase.scala?

2014-07-16 Thread Chester Chen
checked and this bug is fixed in recent releases of Spark. -Sandy On Sun, Jul 13, 2014 at 8:15 PM, Chester Chen ches...@alpinenow.com wrote: Ron, Which distribution and Version of Hadoop are you using ? I just looked at CDH5 ( hadoop-mapreduce-client-core-2.3.0-cdh5.0.0

Re: Possible bug in ClientBase.scala?

2014-07-16 Thread Chester Chen
Hmm, looks like a build script issue. I ran the command:
    sbt/sbt clean yarn/test:compile
but the errors came from yarn-stable:
    [error] 40 errors found
    [error] (yarn-stable/compile:compile) Compilation failed
Chester On Wed, Jul 16, 2014 at 5:18 PM, Chester Chen ches...@alpinenow.com wrote: Hi

Re: Possible bug in ClientBase.scala?

2014-07-16 Thread Chester Chen
]streaming-kafka
[info]streaming-mqtt
[info]streaming-twitter
[info]streaming-zeromq
[info]tools
[info]yarn
[info] * yarn-stable
On Wed, Jul 16, 2014 at 5:41 PM, Chester Chen ches...@alpinenow.com wrote: Hmm looks like a Build script issue: I run the command with : sbt

Re: Application level progress monitoring and communication

2014-06-30 Thread Chester Chen
it with a lot of different ways, such as Akka, custom REST API, Thrift ... I think any of them will do. On Sun, Jun 29, 2014 at 7:57 PM, Chester Chen ches...@alpinenow.com wrote: Hi Spark dev community: I have several questions regarding Application and Spark communication 1

Re: spark config params conventions

2014-03-14 Thread Chester Chen
Based on the Typesafe Config maintainer's response, with the latest version of Typesafe Config, double quotes are no longer needed for keys like spark.speculation, so you don't need code to strip the quotes. Chester Alpine data labs Sent from my iPhone On Mar 12, 2014, at 2:50 PM, Aaron Davidson