Re: [VOTE] SPIP: Structured Streaming - Arbitrary State API v2

2024-01-09 Thread Shixiong Zhu
+1 (binding) Best Regards, Shixiong Zhu On Tue, Jan 9, 2024 at 6:47 PM 刘唯 wrote: > This is a good addition! +1 > > On Tue, Jan 9, 2024 at 13:17, Raghu Angadi wrote: > >> +1. This is a major improvement to the state API. >> >> Raghu. >> >> On Tue, Jan 9, 2024 at 1:

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2024-01-05 Thread Shixiong Zhu
+1. Looking forward to seeing how the new API brings in new streaming use cases! Best Regards, Shixiong Zhu On Wed, Nov 29, 2023 at 6:42 PM Anish Shrigondekar wrote: > Hi dev, > > Addressed the comments that Jungtaek had on the doc. Bumping the thread > once again to see if othe

Re: [VOTE] SPIP: State Data Source - Reader

2023-10-25 Thread Shixiong Zhu
+1 Best Regards, Shixiong Zhu On Wed, Oct 25, 2023 at 4:20 PM Yuanjian Li wrote: > +1 > > On Wed, Oct 25, 2023 at 01:06, Jungtaek Lim wrote: > >> Friendly reminder: the VOTE thread got 2 binding votes and needs 1 more >> binding vote to pass. >> >> On Wed, Oct 2

Re: [DISCUSS] Deprecate DStream in 3.4

2023-01-12 Thread Shixiong Zhu
+1 On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das wrote: > +1 > > On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon wrote: > >> +1 >> >> On Fri, 13 Jan 2023 at 08:51, Jungtaek Lim >> wrote: >> >>> bump for more visibility. >>> >>> On Wed, Jan 11, 2023 at 12:20 PM Jungtaek Lim < >>>

Re: [VOTE][SPIP] Asynchronous Offset Management in Structured Streaming

2022-11-30 Thread Shixiong Zhu
+1 On Wed, Nov 30, 2022 at 8:04 PM Hyukjin Kwon wrote: > +1 > > On Thu, 1 Dec 2022 at 12:39, Mridul Muralidharan wrote: > >> >> +1 >> >> Regards, >> Mridul >> >> On Wed, Nov 30, 2022 at 8:55 PM Xingbo Jiang >> wrote: >> >>> +1 >>> >>> On Wed, Nov 30, 2022 at 5:59 PM Jungtaek Lim < >>>

Re: [DISCUSSION] SPIP: Asynchronous Offset Management in Structured Streaming

2022-11-30 Thread Shixiong Zhu
+1 This is exciting. I agree with Jerry that this SPIP and continuous processing are orthogonal. This SPIP itself would be a great improvement and impact most Structured Streaming users. Best Regards, Shixiong On Wed, Nov 30, 2022 at 6:57 AM Mridul Muralidharan wrote: > > Thanks for all the

Welcome Jose Torres as a Spark committer

2019-01-29 Thread Shixiong Zhu
Hi all, The Apache Spark PMC recently added Jose Torres as a committer on the project. Jose has been a major contributor to Structured Streaming. Please join me in welcoming him! Best Regards, Shixiong Zhu

Re: Spark streaming 1.6.0-RC4 NullPointerException using mapWithState

2015-12-29 Thread Shixiong Zhu
Could you create a JIRA? We can continue the discussion there. Thanks! Best Regards, Shixiong Zhu 2015-12-29 3:42 GMT-08:00 Jan Uyttenhove <j...@insidin.com>: > Hi guys, > > I upgraded to the RC4 of Spark (streaming) 1.6.0 to (re)test the new > mapWithState API, after previous

Re: A bug in Spark standalone? Worker registration and deregistration

2015-12-10 Thread Shixiong Zhu
Jacek, could you create a JIRA for it? I just reproduced it. It's a bug in how the Master handles Worker disconnection. Best Regards, Shixiong Zhu 2015-12-10 2:45 GMT-08:00 Jacek Laskowski <ja...@japila.pl>: > Hi, > > I'm on yesterday's master HEAD. > > Regards,

Re: NettyRpcEnv adverisedPort

2015-11-26 Thread Shixiong Zhu
I think you are right. The executor gets the driver port from "RpcEnv.address". Best Regards, Shixiong Zhu 2015-11-26 11:45 GMT-08:00 Rad Gruchalski <ra...@gruchalski.com>: > Dear all, > > I am currently looking at modifying NettyRpcEnv for this PR: > https://githu

Re: tests blocked at "don't call ssc.stop in listener"

2015-11-26 Thread Shixiong Zhu
Just found a potential deadlock in this test. Will send a PR to fix it soon. Best Regards, Shixiong Zhu 2015-11-26 18:55 GMT-08:00 Saisai Shao <sai.sai.s...@gmail.com>: > Might be related to this JIRA ( > https://issues.apache.org/jira/browse/SPARK-11761), not very sure about >

Re: Why there's no api for SparkContext#textFiles to support multiple inputs ?

2015-11-11 Thread Shixiong Zhu
In addition, if you have more than two text files, you can just put them into a Seq and use "reduce(_ ++ _)". Best Regards, Shixiong Zhu 2015-11-11 10:21 GMT-08:00 Jakob Odersky <joder...@gmail.com>: > Hey Jeff, > Do you mean reading from multiple text files? In that c
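The `reduce(_ ++ _)` idiom mentioned above can be sketched in plain Scala. This is a minimal illustration, not Spark code: ordinary `Seq[String]` values stand in for the RDDs that `sc.textFile` would return, and the names `ConcatSketch`, `concatAll`, and `files` are hypothetical. In Spark, `RDD` also defines `++` as an alias for `union`, so the same `reduce` call applies to a `Seq` of RDDs.

```scala
object ConcatSketch {
  // Combine every collection in `parts` pairwise with `++`.
  // `reduce` throws on an empty Seq, so at least one element is required.
  def concatAll[A](parts: Seq[Seq[A]]): Seq[A] = parts.reduce(_ ++ _)

  def main(args: Array[String]): Unit = {
    // Each inner Seq plays the role of the lines of one text file.
    val files: Seq[Seq[String]] = Seq(
      Seq("a1", "a2"),
      Seq("b1"),
      Seq("c1", "c2")
    )
    println(concatAll(files).mkString(", "))
  }
}
```

With real RDDs the call would presumably look like `paths.map(p => sc.textFile(p)).reduce(_ ++ _)`, where `paths` is an assumed `Seq[String]` of input locations.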

Re: [SQL] Memory leak with spark streaming and spark sql in spark 1.5.1

2015-10-15 Thread Shixiong Zhu
Thanks for reporting it Terry. I submitted a PR to fix it: https://github.com/apache/spark/pull/9132 Best Regards, Shixiong Zhu 2015-10-15 2:39 GMT+08:00 Reynold Xin <r...@databricks.com>: > +dev list > > On Wed, Oct 14, 2015 at 1:07 AM, Terry Hoo <hujie.ea...@gmail.co

Re: Get only updated RDDs from or after updateStateBykey

2015-09-24 Thread Shixiong Zhu
tabase } else { // update to new state and save to database } // return new state } TaskContext.get().addTaskCompletionListener(_ => db.disconnect()) } Best Regards, Shixiong Zhu 2015-09-24 17:42 GMT+08:00 Bin Wang <wbi...@gmail.com>: > It

Re: Get only updated RDDs from or after updateStateBykey

2015-09-24 Thread Shixiong Zhu
ra/browse/SPARK-2629 but doesn't have a doc now... Best Regards, Shixiong Zhu 2015-09-24 17:26 GMT+08:00 Bin Wang <wbi...@gmail.com>: > Data that are not updated should be saved earlier: while the data added to > the DStream at the first time, it should be considered as updated. So save

Re: Get only updated RDDs from or after updateStateBykey

2015-09-24 Thread Shixiong Zhu
For data that is not updated, where do you save it? Or do you only want to avoid accessing the database for data that is not updated? Besides, the community is working on optimizing the performance of "updateStateByKey". Hopefully it will be delivered soon. Best Regards, Shixiong Zhu 2015-09-24 13

Re: Reply: bug in Worker.scala, ExecutorRunner is not serializable

2015-09-18 Thread Shixiong Zhu
I'm wondering if we should create a tag trait (e.g., LocalMessage) for messages like this and add the comment in the trait. Looks better than adding inline comments for all these messages. Best Regards, Shixiong Zhu 2015-09-18 15:10 GMT+08:00 Reynold Xin <r...@databricks.com>: > Maybe
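The tag-trait idea above could look roughly like the following. This is a sketch under assumptions, not Spark's actual code: `LocalMessage` is the name floated in the message, while `TagTraitSketch`, `isLocalOnly`, and `LaunchRequest` are hypothetical, and `RequestWorkerState` is redeclared here only as a stand-in for the real message.

```scala
object TagTraitSketch {
  /** Marker for messages that are only exchanged inside one process
    * and therefore are never sent over the wire. (Hypothetical.) */
  trait LocalMessage

  // Stand-in for an internal message that stays within a single process.
  case object RequestWorkerState extends LocalMessage

  // Hypothetical message that does cross process boundaries.
  final case class LaunchRequest(appId: String)

  // The rule lives in one place: check the tag instead of reading comments.
  def isLocalOnly(msg: Any): Boolean = msg match {
    case _: LocalMessage => true
    case _               => false
  }

  def main(args: Array[String]): Unit = {
    println(isLocalOnly(RequestWorkerState))
    println(isLocalOnly(LaunchRequest("app-1")))
  }
}
```

The documentation then sits once on the trait, and each local-only message just extends it.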

Re: bug in Worker.scala, ExecutorRunner is not serializable

2015-09-17 Thread Shixiong Zhu
RequestWorkerState is an internal message between Worker and WorkerWebUI. Since they are in the same process, that's fine. Actually, these are not public APIs. Could you elaborate on your use case? Best Regards, Shixiong Zhu 2015-09-17 16:36 GMT+08:00 Huangguowei <huangguo...@huawei.

Re: Welcoming three new committers

2015-02-03 Thread Shixiong Zhu
Congrats guys! Best Regards, Shixiong Zhu 2015-02-04 6:34 GMT+08:00 Matei Zaharia matei.zaha...@gmail.com: Hi all, The PMC recently voted to add three new committers: Cheng Lian, Joseph Bradley and Sean Owen. All three have been major contributors to Spark in the past year: Cheng on Spark

Why the major.minor version of the new hive-exec is 51.0?

2014-12-30 Thread Shixiong Zhu
(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:169) at Test.main(Test.java:5) Best Regards, Shixiong Zhu

Re: Announcing Spark 1.2!

2014-12-19 Thread Shixiong Zhu
Congrats! A quick question about this release: which commit is this release based on? v1.2.0 and v1.2.0-rc2 point to different commits in https://github.com/apache/spark/releases Best Regards, Shixiong Zhu 2014-12-19 16:52 GMT+08:00 Patrick Wendell pwend...@gmail.com: I'm happy

Re: About implicit rddToPairRDDFunctions

2014-11-13 Thread Shixiong Zhu
about the implicit search logic: http://eed3si9n.com/revisiting-implicits-without-import-tax To maintain compatibility, we can keep `rddToPairRDDFunctions` in the SparkContext but remove `implicit`. The disadvantage is that there would be two copies of the same code. Best Regards, Shixiong Zhu 2014-11

Re: About implicit rddToPairRDDFunctions

2014-11-13 Thread Shixiong Zhu
OK. I'll take it. Best Regards, Shixiong Zhu 2014-11-14 12:34 GMT+08:00 Reynold Xin r...@databricks.com: That seems like a great idea. Can you submit a pull request? On Thu, Nov 13, 2014 at 7:13 PM, Shixiong Zhu zsxw...@gmail.com wrote: If we put the `implicit` into package object rdd

About implicit rddToPairRDDFunctions

2014-11-06 Thread Shixiong Zhu
: Ordering[K] = null) = { new PairRDDFunctions(rdd) } If so, the conversion would be automatic, with no need to import org.apache.spark.SparkContext._ I searched for earlier discussion but found nothing. Best Regards, Shixiong Zhu
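The idea of making the conversion available without an import can be illustrated with a small standalone sketch. It assumes the conversion is placed in the companion object of the source type, which the Scala compiler searches automatically when looking for implicit views. `MiniRDD`, `MiniPairFunctions`, and `toPairFunctions` are hypothetical stand-ins, not Spark's classes.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for an RDD: it just wraps a local Seq.
class MiniRDD[T](val data: Seq[T])

// Extra operations that only make sense for key-value pairs.
class MiniPairFunctions[K, V](self: MiniRDD[(K, V)]) {
  def keys: Seq[K] = self.data.map(_._1)
}

object MiniRDD {
  // Defined in the companion object, so it belongs to the implicit scope
  // of MiniRDD and is found without any import at the call site.
  implicit def toPairFunctions[K, V](rdd: MiniRDD[(K, V)]): MiniPairFunctions[K, V] =
    new MiniPairFunctions(rdd)
}

object ImplicitScopeSketch {
  def main(args: Array[String]): Unit = {
    val pairs = new MiniRDD(Seq("a" -> 1, "b" -> 2))
    // No import needed: the compiler resolves MiniRDD.toPairFunctions.
    println(pairs.keys.mkString(", "))
  }
}
```

If the conversion lived in an unrelated object instead, the call to `keys` would only compile after importing that object's members, which is the import burden the message describes.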