Re: [discuss] SparkR CRAN feasibility check server problem

2018-12-12 Thread Liang-Chi Hsieh
Just got a reply from the CRAN admin. It should be fixed now. Hyukjin Kwon wrote: > Thanks, Liang-chi. > > On Thu, 13 Dec 2018, 8:29 am Liang-Chi Hsieh <viirya@> wrote: >> Sorry for the late reply. There is a malformed record on the CRAN package page again. I've already asked the CRAN admin for hel

classpath entries with hdfs path

2018-12-12 Thread sandeep_katta
Hi all, I have a use case where some of the jars are on HDFS, and I want to include these jars in my driver classpath. If I pass them with --jars it works fine, but if I pass them using spark.driver.extraClassPath it fails. spark-sql --master yarn --jars hdfs://hacluster/tmp/testjar/* //Jars are loaded to the cl

dsv2 remaining work

2018-12-12 Thread Reynold Xin
Unfortunately I can't make it to the DSv2 sync today. Sending an email with my thoughts instead. I spent a few hours thinking about this. It's evident that progress has been slow, because this is an important API and people from different perspectives have very different requirements, and the pr

Re: [discuss] SparkR CRAN feasibility check server problem

2018-12-12 Thread Hyukjin Kwon
Thanks, Liang-chi. On Thu, 13 Dec 2018, 8:29 am Liang-Chi Hsieh wrote: > Sorry for the late reply. There is a malformed record on the CRAN package page again. I've already asked the CRAN admin for help. It should be fixed soon according to past experience. > > Related discussion will be in > https://issues.apache

Re: [discuss] SparkR CRAN feasibility check server problem

2018-12-12 Thread Liang-Chi Hsieh
Sorry for the late reply. There is a malformed record on the CRAN package page again. I've already asked the CRAN admin for help. It should be fixed soon according to past experience. Related discussion will be in https://issues.apache.org/jira/browse/SPARK-24152. I will post here if I get a reply from the CRAN admin.

Re: Self join

2018-12-12 Thread Ryan Blue
Marco, I'm actually asking for a design doc that clearly states the problem and proposes a solution. This is a substantial change and probably should be an SPIP. I think that would be more likely to generate discussion than referring to PRs or a quick paragraph on the dev list, because the only p

Re: Ask for reviewing on Structured Streaming PRs

2018-12-12 Thread Vaclav Kosar
I am also waiting for my PR [3] to be finalized. It seems that SS PRs are not being reviewed much these days. [3] https://github.com/apache/spark/pull/21919 On 12. 12. 18 14:37, Dongjin Lee wrote: If it is possible, could you review my PR on Kafka's header functionality[^1] also? It was ad

Re: [discuss] SparkR CRAN feasibility check server problem

2018-12-12 Thread Liang-Chi Hsieh
Thanks for letting me know! I will look into it and ask the CRAN admin for help. Hyukjin Kwon wrote: > Looks like it's happening again. Liang-Chi, do you mind if I ask it again? > > FYI, R 3.4 is officially deprecated as of > https://github.com/apache/spark/pull/23012 > We could upgrade R version to 3.4

Re: Ask for reviewing on Structured Streaming PRs

2018-12-12 Thread Dongjin Lee
If it is possible, could you review my PR on Kafka's header functionality[^1] also? It was added in Kafka 0.11.0.0 but still not supported in Spark. Thanks, Dongjin [^1]: https://github.com/apache/spark/pull/22282 [^2]: https://issues.apache.org/jira/browse/KAFKA-4208 On Wed, Dec 12, 2018 at 6:4

Re: [discuss] SparkR CRAN feasibility check server problem

2018-12-12 Thread Hyukjin Kwon
Looks like it's happening again. Liang-Chi, do you mind if I ask it again? FYI, R 3.4 is officially deprecated as of https://github.com/apache/spark/pull/23012 We could upgrade the R version to 3.4.x in Jenkins, which deals with the malformed(?) responses after the 3.0 release. Then, we could get rid of this p

Re: Behavior of checkpointLocation from options vs setting conf spark.sql.streaming.checkpointLocation

2018-12-12 Thread Shubham Chaurasia
Thanks Gabor. On Wed, Dec 12, 2018, 4:06 PM Gabor Somogyi wrote: > Hi Shubham, > > I've just checked the latest master branch and I can confirm it works as you've described. > As a workaround, one can read the ** in the directory structure and set it with .queryName("") before restart. > > BR, >

Re: Behavior of checkpointLocation from options vs setting conf spark.sql.streaming.checkpointLocation

2018-12-12 Thread Gabor Somogyi
Hi Shubham, I've just checked the latest master branch and I can confirm it works as you've described. As a workaround, one can read the ** in the directory structure and set it with .queryName("") before restart. BR, G On Tue, Dec 11, 2018 at 6:45 AM Shubham Chaurasia wrote: > Hi, > > I w
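
The workaround above can be sketched as a small model in plain Python. This is not Spark's API: the function `resolve_checkpoint_location` and the path `/tmp/checkpoints` are hypothetical. It assumes that an explicit `checkpointLocation` option is used as given, and that with only `spark.sql.streaming.checkpointLocation` set, each query gets a subdirectory named after its query name, or after a random UUID when no name is set.

```python
import posixpath
import uuid


def resolve_checkpoint_location(option_location=None, conf_location=None, query_name=None):
    """Hypothetical model of how a query's checkpoint directory is chosen."""
    # An explicit .option("checkpointLocation", ...) is used as given.
    if option_location is not None:
        return option_location
    # With only the session conf set, the query gets a subdirectory:
    # its name if one was given, otherwise a fresh random UUID.
    if conf_location is not None:
        return posixpath.join(conf_location, query_name or str(uuid.uuid4()))
    # Neither is set in this model.
    return None


base = "/tmp/checkpoints"  # hypothetical conf value

first_run = resolve_checkpoint_location(conf_location=base)
second_run = resolve_checkpoint_location(conf_location=base)
# Unnamed queries land in a different directory on each start.
print(first_run != second_run)

# Reusing the directory name from the first run as the query name
# points the restarted query back at the same checkpoint directory.
name = posixpath.basename(first_run)
restarted = resolve_checkpoint_location(conf_location=base, query_name=name)
print(restarted == first_run)
```

Under these assumptions, setting a stable query name before the first start avoids having to look up the generated directory name later.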

Re: Self join

2018-12-12 Thread Marco Gaido
Thank you all for your answers. @Ryan Blue sure, let me state the problem more clearly: imagine you have 2 dataframes with a common lineage (for instance one is derived from the other by some filtering or anything you prefer). And imagine you want to join these 2 dataframes. Currently, there is a

Ask for reviewing on Structured Streaming PRs

2018-12-12 Thread Jungtaek Lim
Hi devs, Could I kindly ask for reviews of the PRs for Structured Streaming? I have 5 open pull requests on the SS side [1] (the earliest PR was opened around 4 months ago), and there look to be a couple of PRs from others [2] that are ready for review, too. Thanks in advance, Jungtaek Lim (HeartSaVio