Re: Revisiting Online serving of Spark models?

2018-05-31 Thread Chris Fregly
Hey everyone! @Felix: thanks for putting this together. I sent some of you a quick calendar event - mostly for me, so I don’t forget! :) Coincidentally, this is the focus of June 6th's Advanced Spark and TensorFlow Meetup

Re: Feedback on first commit + jira issue I opened

2018-05-31 Thread Bryan Cutler
Hi Andrew, Please just go ahead and make the pull request. It's easier to review and give feedback, thanks! Bryan On Thu, May 31, 2018 at 9:44 AM, Long, Andrew wrote: > Hello Friends, > > > > I’m a new committer and I’ve submitted my first patch and I had some > questions about documentation

Re: [VOTE] SPIP ML Pipelines in R

2018-05-31 Thread Joseph Bradley
Hossein might be slow to respond (OOO), but I just commented on the JIRA. I'd recommend we follow the same process as the SparkR package. +1 on this from me (and I'll be happy to help shepherd it, though Felix and Shivaram are the experts in this area). CRAN presents challenges, but this is a

REMINDER: Apache EU Roadshow 2018 in Berlin is less than 2 weeks away!

2018-05-31 Thread sharan
Hello Apache Supporters and Enthusiasts, This is a reminder that our Apache EU Roadshow in Berlin is less than two weeks away and we need your help to spread the word. Please let your work colleagues, friends and anyone interested in attending know about our Apache EU Roadshow event. We

[Spark SQL Discuss] Better support for Partitioning and Bucketing when used together

2018-05-31 Thread pnpranavrao
Hello, We use partitioned + bucketed datasets for use-cases where we can afford to take a perf hit at write time, so that reads are optimised. But I feel Spark could more optimally exploit the data layout in query planning. Here I describe why this is a problem, and how it could be improved.
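To make the layout in question concrete, here is a toy sketch in plain Python (not Spark code) of how a partitioned + bucketed dataset lets a reader touch only one group of rows for a point lookup. Everything in it is an assumption made for illustration: the column names `date` and `user_id`, the bucket count, and the CRC32 stand-in hash (Spark uses its own hash function, so real bucket ids would differ).

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count, chosen only for this sketch


def bucket_id(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    # Stand-in hash (CRC32). Spark uses its own hash function,
    # so the bucket ids produced here will not match Spark's.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def write_layout(rows):
    """Group rows by (partition value, bucket id), mimicking a
    directory-per-partition, file-per-bucket layout."""
    layout = defaultdict(list)
    for row in rows:
        layout[(row["date"], bucket_id(row["user_id"]))].append(row)
    return layout


def lookup(layout, date, user_id):
    """Read only the single (partition, bucket) group that can hold the key."""
    group = layout.get((date, bucket_id(user_id)), [])
    return [r for r in group if r["user_id"] == user_id]


# Hypothetical sample data.
rows = [
    {"date": "2018-05-30", "user_id": "u1", "clicks": 3},
    {"date": "2018-05-31", "user_id": "u1", "clicks": 5},
    {"date": "2018-05-31", "user_id": "u2", "clicks": 7},
]
layout = write_layout(rows)
result = lookup(layout, "2018-05-31", "u1")
```

The extra grouping work happens at write time, which is the perf hit mentioned above; the read side then skips every partition and bucket that cannot contain the requested key.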

Feedback on first commit + jira issue I opened

2018-05-31 Thread Long, Andrew
Hello Friends, I’m a new committer and I’ve submitted my first patch and I had some questions about documentation standards. In my patch (JIRA below) I’ve added a config parameter to adjust the number of records shown when a user calls .show() on a DataFrame. I was hoping someone could double

Re: [VOTE] SPIP ML Pipelines in R

2018-05-31 Thread Shivaram Venkataraman
Hossein -- Can you clarify what the resolution is on the repository / release issue discussed in the SPIP? Shivaram On Thu, May 31, 2018 at 9:06 AM, Felix Cheung wrote: > +1 > With my concerns in the SPIP discussion. > > > From: Hossein > Sent: Wednesday, May 30,

Re: [VOTE] SPIP ML Pipelines in R

2018-05-31 Thread Felix Cheung
+1 With my concerns in the SPIP discussion. From: Hossein Sent: Wednesday, May 30, 2018 2:03:03 PM To: dev@spark.apache.org Subject: [VOTE] SPIP ML Pipelines in R Hi, I started discussion

Re: MatrixUDT and VectorUDT in Spark ML

2018-05-31 Thread Li Jin
Please see https://issues.apache.org/jira/browse/SPARK-24258 On Wed, May 30, 2018 at 10:40 PM Dongjin Lee wrote: > How is this issue going? Is there any Jira ticket about this? > > Thanks, > Dongjin > > On Sat, Mar 24, 2018 at 1:39 PM, Himanshu Mohan < > himanshu.mo...@aexp.com.invalid> wrote: >

Re: [SQL] Purpose of RuntimeReplaceable unevaluable unary expressions?

2018-05-31 Thread Jacek Laskowski
Yay! That's right!!! Thanks Reynold. Such a short answer with so much information. Thanks. Regards, Jacek Laskowski https://about.me/JacekLaskowski Mastering Spark SQL https://bit.ly/mastering-spark-sql Spark Structured Streaming https://bit.ly/spark-structured-streaming Mastering Kafka

Re: Spark version for Mesos 0.27.0

2018-05-31 Thread Thodoris Zois
Ok! Thank you very much! - Thodoris On Thu, 2018-05-31 at 11:30 +0200, Szuromi Tamás wrote: > I see it in the Serenity docs. Anyway, I guess you are able to use > the newest version of Spark with Mesos 0.27 without any issues, so > you don't have to dispense with newer Spark features and fixes. >

Re: Spark version for Mesos 0.27.0

2018-05-31 Thread Szuromi Tamás
I see it in the Serenity docs. Anyway, I guess you are able to use the newest version of Spark with Mesos 0.27 without any issues, so you don't have to dispense with newer Spark features and fixes. Thodoris Zois wrote (date: Thu, May 31, 2018, 11:22): > Hello, > > The reason is that I want to

Re: Spark version for Mesos 0.27.0

2018-05-31 Thread Thodoris Zois
Hello, The reason is that I want to run some tests and use the oversubscription feature of Mesos along with Spark. Intel and Mesosphere have built a project called Serenity that actually measures the usage slack on each Mesos agent and returns resources to the cluster. Unfortunately, Serenity

Re: Spark version for Mesos 0.27.0

2018-05-31 Thread Szuromi Tamás
Hey, I'm sure we used Spark 1.6 on Mesos 0.27 as well, but at that time we used it with fine-grained scheduling and not dynamic allocation. Also, newer Spark versions should work with older Mesos versions like 0.27. Why do you have Mesos 0.27, btw? cheers, Tamas Thodoris Zois wrote (date: