Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Nan Zhu
just curious what happened on google’s spark operator? On Thu, Nov 9, 2023 at 19:12 Ilan Filonenko wrote: > +1 > > On Thu, Nov 9, 2023 at 7:43 PM Ryan Blue wrote: > >> +1 >> >> On Thu, Nov 9, 2023 at 4:23 PM Hussein Awala wrote: >> >>> +1 for creating an official Kubernetes operator for

Re: ASF policy violation and Scala version issues

2023-06-07 Thread Nan Zhu
for EMR, I think they show 3.1.2-amazon in Spark UI, no? On Wed, Jun 7, 2023 at 11:30 Grisha Weintraub wrote: > Hi, > > I am not taking sides here, but just for fairness, I think it should be > noted that AWS EMR does exactly the same thing. > We choose the EMR version (e.g., 6.4.0) and it

Re: Spark 2.4.5 release for Parquet and Avro dependency updates?

2019-11-22 Thread Nan Zhu
I am not sure if it is a good practice to have breaking changes in dependencies for maintenance releases On Fri, Nov 22, 2019 at 8:56 AM Michael Heuer wrote: > Hello, > > Avro 1.8.2 to 1.9.1 is a binary incompatible update, and it appears that > Parquet 1.10.1 to 1.11 will be a

Re: Time to cut an Apache 2.4.1 release?

2019-02-12 Thread Nan Zhu
Just filed a JIRA at https://issues.apache.org/jira/browse/SPARK-26862. This issue only happens in 2.4.0, not in 2.3.2. Could anyone help look into it? On Tue, Feb 12, 2019 at 10:41 AM DB Tsai wrote: > Great. I'll prepare the release for voting. Thanks! > > DB Tsai | Siri Open

Re: Integrating ML/DL frameworks with Spark

2018-05-08 Thread Nan Zhu
.how I skipped the last part On Tue, May 8, 2018 at 11:16 AM, Reynold Xin <r...@databricks.com> wrote: > Yes, Nan, totally agree. To be on the same page, that's exactly what I > wrote wasn't it? > > On Tue, May 8, 2018 at 11:14 AM Nan Zhu <zhunanmcg...@gmail.com

Re: Integrating ML/DL frameworks with Spark

2018-05-08 Thread Nan Zhu
besides that, one of the things which is needed by multiple frameworks is to schedule tasks in a single wave i.e. if some frameworks like xgboost/mxnet requires 50 parallel workers, Spark is desired to provide a capability to ensure that either we run 50 tasks at once, or we should quit the

Re: [VOTE] Spark 2.3.0 (RC5)

2018-02-26 Thread Nan Zhu
+1 (non-binding), tested with internal workloads and benchmarks On Mon, Feb 26, 2018 at 12:09 PM, Michael Armbrust wrote: > +1 all our pipelines have been running the RC for several days now. > > On Mon, Feb 26, 2018 at 10:33 AM, Dongjoon Hyun

Re: Palantir replease under org.apache.spark?

2018-01-09 Thread Nan Zhu
nvm On Tue, Jan 9, 2018 at 9:42 AM, Nan Zhu <zhunanmcg...@gmail.com> wrote: > Hi, all > > Out of curiosity, I just found a bunch of Palantir releases under > org.apache.spark in Maven Central (https://mvnrepository.com/ > artifact/org.apache.spark/spark-core_2.11)? > >

Palantir replease under org.apache.spark?

2018-01-09 Thread Nan Zhu
Hi, all Out of curiosity, I just found a bunch of Palantir releases under org.apache.spark in Maven Central (https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11)? Is it on purpose? Best, Nan

Request for review of SPARK-22599

2017-11-29 Thread Nan Zhu
Hi, all When we ran perf tests for Spark, we found that enabling the table cache does not bring the expected speedup compared to cloud-storage + parquet in many scenarios. We identified that the performance cost is brought by the fact that the current InMemoryRelation/InMemoryTableScanExec will

Re: Outstanding Spark 2.1.1 issues

2017-03-20 Thread Nan Zhu
I think https://issues.apache.org/jira/browse/SPARK-19280 should be a blocker Best, Nan On Mon, Mar 20, 2017 at 8:18 PM, Felix Cheung wrote: > I've been scrubbing R and think we are tracking 2 issues > > https://issues.apache.org/jira/browse/SPARK-19237 > >

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Nan Zhu
Congratulations! On Tue, Jan 24, 2017 at 4:50 PM, Hyukjin Kwon wrote: > Congratuation!! > > 2017-01-25 9:22 GMT+09:00 Takeshi Yamamuro : > >> Congrats! >> >> // maropu >> >> On Wed, Jan 25, 2017 at 9:20 AM, Kousuke Saruta < >>

Re: Welcoming Yanbo Liang as a committer

2016-06-03 Thread Nan Zhu
Congratulations ! --  Nan Zhu On June 3, 2016 at 10:50:33 PM, Ted Yu (yuzhih...@gmail.com) wrote: Congratulations, Yanbo. On Fri, Jun 3, 2016 at 7:48 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote: Hi all, The PMC recently voted to add Yanbo Liang as a committer. Yanbo has been a

Release Announcement: XGBoost4J - Portable Distributed XGBoost in Spark, Flink and Dataflow

2016-03-15 Thread Nan Zhu
! For more details of distributed XGBoost, you can refer to the recently published paper: http://arxiv.org/abs/1603.02754 Best, -- Nan Zhu http://codingcat.me

tests blocked at "don't call ssc.stop in listener"

2015-11-26 Thread Nan Zhu
https://issues.apache.org/jira/browse/SPARK-12021 Best, -- Nan Zhu http://codingcat.me

Re: A proposal for Spark 2.0

2015-11-12 Thread Nan Zhu
Being specific to Parameter Server, I think the current agreement is that PS shall exist as a third-party library instead of a component of the core code base, isn’t it? Best, -- Nan Zhu http://codingcat.me On Thursday, November 12, 2015 at 9:49 AM, wi...@qq.com wrote: > Who has the i

Re: [SparkScore]Performance portal for Apache Spark - WW26

2015-06-26 Thread Nan Zhu
Thank you, Jie! Very nice work! -- Nan Zhu http://codingcat.me On Friday, June 26, 2015 at 8:17 AM, Huang, Jie wrote: Correct. Your calculation is right! We have been aware of that kmeans performance drop also. According to our observation, it is caused by some unbalanced

Re: [SparkScore]Performance portal for Apache Spark - WW26

2015-06-26 Thread Nan Zhu
, what happened to k-means in HiBench? Best, -- Nan Zhu http://codingcat.me On Friday, June 26, 2015 at 7:24 AM, Huang, Jie wrote: Intel® Xeon® CPU E5-2697

Re: Welcoming three new committers

2015-02-03 Thread Nan Zhu
Congratulations! -- Nan Zhu http://codingcat.me On Tuesday, February 3, 2015 at 8:08 PM, Xuefeng Wu wrote: Congratulations! Well done. Yours, Xuefeng Wu 吴雪峰 On February 4, 2015, at 6:34 AM, Matei Zaharia matei.zaha...@gmail.com wrote: Hi all

Re: missing document of several messages in actor-based receiver?

2015-01-09 Thread Nan Zhu
Hi, I have created the PR for these two issues Best, -- Nan Zhu http://codingcat.me On Friday, January 9, 2015 at 7:38 AM, Nan Zhu wrote: Thanks, TD, I just created 2 JIRAs to track these, https://issues.apache.org/jira/browse/SPARK-5174 https://issues.apache.org/jira

missing document of several messages in actor-based receiver?

2015-01-08 Thread Nan Zhu
to AtomicInteger ? Best, -- Nan Zhu http://codingcat.me

Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

2014-11-20 Thread Nan Zhu
BTW, this PR https://github.com/apache/spark/pull/2524 is related to a blocker-level bug, and it is actually close to being merged (it has been reviewed for several rounds). I would appreciate it if anyone could continue the process, @mateiz -- Nan Zhu http://codingcat.me On Thursday, November

Re: [VOTE] Designating maintainers for some Spark components

2014-11-05 Thread Nan Zhu
, -- Nan Zhu On Wednesday, November 5, 2014 at 8:33 PM, Matei Zaharia wrote: BTW, my own vote is obviously +1 (binding). Matei On Nov 5, 2014, at 5:31 PM, Matei Zaharia matei.zaha...@gmail.com wrote: Hi all, I wanted to share a discussion

Re: serialVersionUID incompatible error in class BlockManagerId

2014-10-24 Thread Nan Zhu
to replace cluster jar with branch-jdbc-1.0 jar file….. Best, -- Nan Zhu On Friday, October 24, 2014 at 9:23 PM, Josh Rosen wrote: Are all processes (Master, Worker, Executors, Driver) running the same Spark build? This error implies that you’re seeing protocol / binary incompatibilities

Re: something wrong with Jenkins or something untested merged?

2014-10-21 Thread Nan Zhu
just curious…what is this “NewSparkPullRequestBuilder”? Best, -- Nan Zhu On Tuesday, October 21, 2014 at 8:30 AM, Cheng Lian wrote: Hm, seems that 7u71 comes back again. Observed similar Kinesis compilation error just now: https://amplab.cs.berkeley.edu/jenkins/job

Re: something wrong with Jenkins or something untested merged?

2014-10-21 Thread Nan Zhu
Weird... two builds (one triggered by New, one triggered by Old) were executed on the same node, amp-jenkins-slave-01; one compiles, the other does not... Best, -- Nan Zhu On Tuesday, October 21, 2014 at 9:39 AM, Nan Zhu wrote: It seems that all PRs built by NewSparkPRBuilder suffer from 7u71, while

Re: something wrong with Jenkins or something untested merged?

2014-10-21 Thread Nan Zhu
not sure what's causing this. - Josh On October 21, 2014 at 6:35:39 AM, Nan Zhu (zhunanmcg...@gmail.com) wrote: Weird... two builds (one triggered by New, one triggered by Old) were executed on the same node, amp-jenkins-slave-01; one compiles, the other does not... Best, -- Nan

something wrong with Jenkins or something untested merged?

2014-10-20 Thread Nan Zhu
Hi, I just submitted a patch https://github.com/apache/spark/pull/2864/files with one line change but the Jenkins told me it's failed to compile on the unrelated files? https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21935/console Best, Nan

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Nan Zhu
yes, I can compile locally, too but it seems that Jenkins is not happy now...https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/ All failed to compile Best, -- Nan Zhu On Monday, October 20, 2014 at 7:56 PM, Ted Yu wrote: I performed build on latest master branch

Re: Breaking the previous large-scale sort record with Spark

2014-10-10 Thread Nan Zhu
Great! Congratulations! -- Nan Zhu On Friday, October 10, 2014 at 11:19 AM, Mridul Muralidharan wrote: Brilliant stuff ! Congrats all :-) This is indeed really heartening news ! Regards, Mridul On Fri, Oct 10, 2014 at 8:24 PM, Matei Zaharia matei.zaha...@gmail.com

Re: jenkins downtime/system upgrade wednesday morning, 730am PDT

2014-09-29 Thread Nan Zhu
/SparkPullRequestBuilder/lib/apache-rat-0.10.jar Error: Invalid or corrupt jarfile /home/jenkins/workspace/SparkPullRequestBuilder/lib/apache-rat-0.10.jar RAT checks passed. Something wrong? Best, -- Nan Zhu On Monday, September 29, 2014 at 4:43 PM, shane knapp wrote: happy monday, everyone

Re: executorAdded event to DAGScheduler

2014-09-26 Thread Nan Zhu
such a deployment mode Best, -- Nan Zhu On Friday, September 26, 2014 at 8:02 AM, praveen seluka wrote: Can someone explain the motivation behind passing executorAdded event to DAGScheduler ? DAGScheduler does submitWaitingStages when executorAdded method is called

Re: A couple questions about shared variables

2014-09-24 Thread Nan Zhu
I proposed a fix: https://github.com/apache/spark/pull/2524 Glad to receive feedback -- Nan Zhu On Tuesday, September 23, 2014 at 9:06 PM, Sandy Ryza wrote: Filed https://issues.apache.org/jira/browse/SPARK-3642 for documenting these nuances. -Sandy On Mon, Sep 22, 2014 at 10

do MIMA checking before all test cases start?

2014-09-24 Thread Nan Zhu
compatibility issues, you just need to do some minor changes, but in the current environment, you can only get if your change works after all test cases finished (1 hour later…) Best, -- Nan Zhu

Re: do MIMA checking before all test cases start?

2014-09-24 Thread Nan Zhu
Yeah, I tried that, but whenever I run dev/mima it gives me binary compatibility errors on the Java API part, so I have to wait for Jenkins’ result when fixing MIMA issues -- Nan Zhu On Thursday, September 25, 2014 at 12:04 AM, Patrick Wendell wrote: Have

Re: A couple questions about shared variables

2014-09-22 Thread Nan Zhu
, while others implementing something like Hadoop counters may need the current implementation (count everything happened, including the duplications) Your thoughts? -- Nan Zhu On Sunday, September 21, 2014 at 6:35 PM, Matei Zaharia wrote: Hmm, good point, this seems to have been broken

Re: A couple questions about shared variables

2014-09-22 Thread Nan Zhu
I see, thanks for pointing this out -- Nan Zhu On Monday, September 22, 2014 at 12:08 PM, Sandy Ryza wrote: MapReduce counters do not count duplications. In MapReduce, if a task needs to be re-run, the value of the counter from the second task overwrites the value from the first

Re: Some Serious Issue with Spark Streaming ? Blocks Getting Removed and Jobs have Failed..

2014-09-11 Thread Nan Zhu
Hi, Can you attach more logs to see if there is some entry from ContextCleaner? I ran into a very similar issue before, but it hasn’t been resolved Best, -- Nan Zhu On Thursday, September 11, 2014 at 10:13 AM, Dibyendu Bhattacharya wrote: Dear All, Not sure if this is a false alarm

Re: Some Serious Issue with Spark Streaming ? Blocks Getting Removed and Jobs have Failed..

2014-09-11 Thread Nan Zhu
) at java.lang.Thread.run(Thread.java:744) -- Nan Zhu On Thursday, September 11, 2014 at 10:42 AM, Nan Zhu wrote: Hi, Can you attach more logs to see if there is some entry from ContextCleaner? I ran into a very similar issue before, but it hasn’t been resolved Best, -- Nan Zhu

jenkins failed all tests?

2014-09-07 Thread Nan Zhu
Hi, all I only modified some documentation, but the build still failed to pass the tests? https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19950/consoleFull Can anyone look at the problem? Best, -- Nan Zhu

Re: jenkins failed all tests?

2014-09-07 Thread Nan Zhu
Hi, Sean, Thanks for the reply Here are the updated files: https://github.com/apache/spark/pull/2312/files just two md files... Best, -- Nan Zhu On Sunday, September 7, 2014 at 4:30 PM, Sean Owen wrote: It would help to point to your change. Are you sure it was only docs and are you

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Nan Zhu
+1 tested thrift server with our in-house application, everything works fine -- Nan Zhu On Wednesday, September 3, 2014 at 4:43 PM, Matei Zaharia wrote: +1 Matei On September 3, 2014 at 12:24:32 PM, Cheng Lian (lian.cs@gmail.com) wrote: +1

Re: branch-1.1 will be cut on Friday

2014-07-27 Thread Nan Zhu
, -- Nan Zhu On Sunday, July 27, 2014 at 2:31 PM, Patrick Wendell wrote: Hey All, Just a heads up, we'll cut branch-1.1 on this Friday, August 1st. Once the release branch is cut we'll start community QA and go into the normal triage process for merging patches into that branch. For Spark

new JDBC server test cases seems failed ?

2014-07-27 Thread Nan Zhu
0 [info] *** 2 TESTS FAILED *** Best, -- Nan Zhu

spark.executor.memory is not applicable when running unit test in Jenkins?

2014-07-21 Thread Nan Zhu
that? Thanks, -- Nan Zhu

Re: Pull requests will be automatically linked to JIRA when submitted

2014-07-20 Thread Nan Zhu
Awesome! On Saturday, July 19, 2014, Patrick Wendell pwend...@gmail.com wrote: Just a small note, today I committed a tool that will automatically mirror pull requests to JIRA issues, so contributors will no longer have to manually post a pull request on the JIRA when they make one. It will

Re: how to run the program compiled with spark 1.0.0 in the branch-0.1-jdbc cluster

2014-07-14 Thread Nan Zhu
Ah, sorry, sorry It's executorState under deploy package On Monday, July 14, 2014, Patrick Wendell pwend...@gmail.com wrote: 1. The first error I met is the different SerializationVersionUID in ExecuterStatus I resolved by explicitly declare SerializationVersionUID in

Re: how to run the program compiled with spark 1.0.0 in the branch-0.1-jdbc cluster

2014-07-14 Thread Nan Zhu
I resolved the issue by setting an internal maven repository to contain the Spark-1.0.1 jar compiled from branch-0.1-jdbc and replacing the dependency to the central repository with our own repository I believe there should be some more lightweight way Best, -- Nan Zhu On Monday, July 14

assign SPARK-2126 to me?

2014-06-19 Thread Nan Zhu
Hi, all Any admin can assign this issue https://issues.apache.org/jira/browse/SPARK-2126 to me? I have started working on this Thanks, -- Nan Zhu

anyone can mark this issue as resolved?

2014-06-17 Thread Nan Zhu
Hi, Just found it occasionally https://issues.apache.org/jira/browse/SPARK-1471 Best, -- Nan Zhu

Re: Add my JIRA username (hsaputra) to Spark's contributor's list

2014-06-03 Thread Nan Zhu
I think I lost that permission too? Patrick once helped to recover the permission, but I lost that permission again? username is CodingCat, or Nan Zhu (I’m not sure which one you use when doing this)? Best, -- Nan Zhu On Tuesday, June 3, 2014 at 2:39 PM, Henry Saputra wrote: Thanks

Re: Streaming example stops outputting (Java, Kafka at least)

2014-05-30 Thread Nan Zhu
Hi, Sean I had the same problem, but when I changed MASTER=“local” to MASTER=“local[2]” everything went back to normal. I hadn’t had a chance to ask here Best, -- Nan Zhu On Friday, May 30, 2014 at 9:09 AM, Sean Owen wrote: Guys I'm struggling to debug some strange behavior

Re: Streaming example stops outputting (Java, Kafka at least)

2014-05-30 Thread Nan Zhu
StreamingContext("local", "NetworkWordCount", Seconds(1)) http://spark.apache.org/docs/latest/streaming-programming-guide.html I created a JIRA and a PR https://github.com/apache/spark/pull/924 -- Nan Zhu On Friday, May 30, 2014 at 1:53 PM, Patrick Wendell wrote: Yeah - Spark streaming needs at least two

Re: spark 1.0 standalone application

2014-05-19 Thread Nan Zhu
en, you have to put spark-assembly-*.jar to the lib directory of your application Best, -- Nan Zhu On Monday, May 19, 2014 at 9:48 PM, nit wrote: I am not much comfortable with sbt. I want to build a standalone application using spark 1.0 RC9. I can build sbt assembly for my application

Re: [VOTE] Release Apache Spark 1.0.0 (rc9)

2014-05-19 Thread Nan Zhu
just rerun my test on rc5 everything works build applications with sbt and the spark-*.jar which is compiled with Hadoop 2.3 +1 -- Nan Zhu On Sunday, May 18, 2014 at 11:07 PM, witgo wrote: How to reproduce this bug? -- Original -- From: Patrick

Re: Spark 1.0.0 rc3

2014-05-03 Thread Nan Zhu
SPARK_HADOOP_VERSION=2.3.0 sbt/sbt assembly and copy the generated jar to lib/ directory of my application, it seems that sbt cannot find the dependencies in the jar? but everything works with the pre-built jar files downloaded from the link provided by Patrick Best, -- Nan Zhu

Re: Any plans for new clustering algorithms?

2014-04-21 Thread Nan Zhu
I thought those are files of spark.apache.org? -- Nan Zhu On Monday, April 21, 2014 at 9:09 PM, Xiangrui Meng wrote: The markdown files are under spark/docs. You can submit a PR for changes. -Xiangrui On Mon, Apr 21, 2014 at 6:01 PM, Sandy Ryza sandy.r...@cloudera.com (mailto:sandy.r

Re: Flaky streaming tests

2014-04-07 Thread Nan Zhu
I met this issue when Jenkins seems to be very busy On Monday, April 7, 2014, Kay Ousterhout k...@eecs.berkeley.edu wrote: Hi all, The InputStreamsSuite seems to have some serious flakiness issues -- I've seen the file input stream fail many times and now I'm seeing some actor input

a weird test case in Streaming

2014-03-29 Thread Nan Zhu
: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13561/ Mark: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13531/ Best, -- Nan Zhu

Re: Travis CI

2014-03-29 Thread Nan Zhu
aspect, I’m just reporting what I saw and hope that can help you to identify the problem Thank you -- Nan Zhu On Tuesday, March 25, 2014 at 10:11 PM, Patrick Wendell wrote: Ya It's been a little bit slow lately because of a high error rate in interactions with the git-hub API

Re: Migration to the new Spark JIRA

2014-03-29 Thread Nan Zhu
That’s great! Andy, thank you for all your contributions to the community ! Best, -- Nan Zhu On Saturday, March 29, 2014 at 11:40 PM, Patrick Wendell wrote: Hey All, We've successfully migrated the Spark JIRA to the Apache infrastructure. This turned out to be a huge effort, lead

Re: Mailbomb from amplabs jenkins ?

2014-03-27 Thread Nan Zhu
yes, it sends for every PR you were involved I think Patrick is doing something on Jenkins, he just stopped some testing jobs manually Best, -- Nan Zhu On Thursday, March 27, 2014 at 11:07 PM, Mridul Muralidharan wrote: Got some 100 odd mails from jenkins (?) with Can one of the admins

Re: Travis CI

2014-03-25 Thread Nan Zhu
I assume the Jenkins is not working now? Best, -- Nan Zhu On Tuesday, March 25, 2014 at 6:42 PM, Michael Armbrust wrote: Just a quick note to everyone that Patrick and I are playing around with Travis CI on the Spark github repository. For now, travis does not run all of the test cases

Re: Travis CI

2014-03-25 Thread Nan Zhu
I just found that the Jenkins is not working from this afternoon for one PR, the first time build failed after 90 minutes, the second time it has run for more than 2 hours, no result is returned Best, -- Nan Zhu On Tuesday, March 25, 2014 at 10:06 PM, Patrick Wendell wrote: That's

How the scala style checker works?

2014-03-19 Thread Nan Zhu
#L515 but the current scala style checker passes this line? Best, -- Nan Zhu

ping of PR #12

2014-03-10 Thread Nan Zhu
very much! -- Nan Zhu

Undocumented configuration parameters

2014-03-05 Thread Nan Zhu
the PR, what’s the reason for the missing documentation? Did the contributors forget to update the docs, or are the parameters intentionally hidden because users are not expected to change them? Best, -- Nan Zhu

Re: Spark JIRA

2014-02-28 Thread Nan Zhu
I think they are working on it? https://issues.apache.org/jira/browse/SPARK Best, -- Nan Zhu On Friday, February 28, 2014 at 2:29 PM, Evan Chan wrote: Hey guys, There is no plan to move the Spark JIRA from the current https://spark-project.atlassian.net/ right? -- -- Evan

Discussion on SPARK-1139

2014-02-26 Thread Nan Zhu
this design, but it will introduce some compatibility issues. Just bringing it here for your advice Best, -- Nan Zhu