Re: Asking for reviewing PRs regarding structured streaming

2018-07-05 Thread Jungtaek Lim
Ted Yu suggested posting the improved numbers to this thread and I think it's good idea, so also posting here, but I also think explaining rationalization of my issues would help understanding why I'm submitting couple of patches, so I'll explain it first. (Sorry to post a wall of text). tl;dr.

Re: Asking for reviewing PRs regarding structured streaming

2018-07-05 Thread Jungtaek Lim
Bump. I have been having hard time working on making additional PRs since some of these rely on non-merged PRs, so spending additional time to decouple these things if possible. https://github.com/apache/spark/pulls/HeartSaVioR Pending 5 PRs so far, and may add more sooner or later. Thanks,

[Streaming][Kinesis][SPARK-20168] Could I get some reviews of the patch that resolves kinesis timestamp resume

2018-07-05 Thread Yash Sharma
Hi Team, Could I get some review at the patch here. Would love to hear suggestions here on the patch. I had to reopen SPARK-20168 because of this bug. https://github.com/apache/spark/pull/21541 https://issues.apache.org/jira/browse/SPARK-20168 Cheers, Yash

[SS] Trigger.Once not working with watermarked event time

2018-07-05 Thread Christopher Horn
Does anyone who worked on SS or is familiar with it have an opinion on this?: https://issues.apache.org/jira/browse/SPARK-24699 https://github.com/apache/spark/pull/21676 I don't like the idea of making Trigger.Once execute two batches, but it is the simplest way to get the desired batch state

Unsubscribe

2018-07-05 Thread xu han

what about put broadcast to offheap?

2018-07-05 Thread 吴晓菊
Now all broadcast data are put on heap since the StorageLevel is hard-coded to MEMORY_AND_DISK_SER in TorrentBroadcast. So it's highly possible the broadcast data will goto oldgen and then may have impact on full-gc. So what about put broadcast to offheap? Chrysan Wu Phone:+86 17717640807

Re: Run Python User Defined Functions / code in Spark with Scala Codebase

2018-07-05 Thread Chetan Khatri
Prem sure, Thanks for suggestion. On Wed, Jul 4, 2018 at 8:38 PM, Prem Sure wrote: > try .pipe(.py) on RDD > > Thanks, > Prem > > On Wed, Jul 4, 2018 at 7:59 PM, Chetan Khatri > wrote: > >> Can someone please suggest me , thanks >> >> On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, >> wrote: >>