Hi,
Why not group by first, then join?
BTW, I don’t think there is any difference between ‘distinct’ and ‘group by’.
Source code of Spark 2.1:

def distinct(): Dataset[T] = dropDuplicates()
…
def dropDuplicates(colNames: Seq[String]): Dataset[T] = withTypedPlan {
  …
  Aggregate(groupCols, aggCols, logicalPlan)
}
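The equivalence the source excerpt shows (distinct compiles down to an Aggregate over the grouping columns) can be checked in any SQL engine. A minimal sketch using Python’s built-in sqlite3 as a stand-in for Spark SQL:

```python
import sqlite3

# Stand-in engine (SQLite) to illustrate that DISTINCT and
# GROUP BY over all selected columns return the same rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b TEXT)")
con.executemany("INSERT INTO t VALUES (?, ?)", [(1, "x"), (1, "x"), (2, "y")])

distinct_rows = con.execute("SELECT DISTINCT a, b FROM t").fetchall()
grouped_rows = con.execute("SELECT a, b FROM t GROUP BY a, b").fetchall()

assert sorted(distinct_rows) == sorted(grouped_rows)  # same result set
```

In Spark, any difference would be in the physical plan, not the result; both paths go through the same Aggregate logical node, as the 2.1 source above shows.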
Hi Gurdit Singh,
Thanks, it is very helpful.
From: Gurdit Singh [mailto:gurdit.si...@bitwiseglobal.com]
Sent: February 22, 2017 13:31
To: Linyuxin <linyu...@huawei.com>; Irving Duran <irving.du...@gmail.com>; Yong Zhang <java8...@hotmail.com>
Cc: Jacek Laskowski <ja...@j
Actually, I want a standalone jar so that I can check the syntax without a Spark execution environment.
From: Irving Duran [mailto:irving.du...@gmail.com]
Sent: February 21, 2017 23:29
To: Yong Zhang <java8...@hotmail.com>
Cc: Jacek Laskowski <ja...@japila.pl>; Linyuxin <linyu...@huawei.co
Hi All,
Is there any tool/API to check the SQL syntax without actually running a Spark job?
Like SiddhiQL on Storm here: SiddhiManagerService.validateExecutionPlan
https://github.com/wso2/siddhi/blob/master/modules/siddhi-core/src/main/java/org/wso2/siddhi/core/SiddhiManagerService.java
Thanks.
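One parse-only approach (an assumption, not an official validation tool): Spark’s own parser can be invoked without executing anything, e.g. `sparkSession.sessionState.sqlParser.parsePlan(sql)` in Spark 2.x, which needs the Spark jars on the classpath but no running cluster. The general “prepare but don’t run” pattern can be sketched with Python’s stdlib sqlite3 (SQLite’s dialect differs from Spark SQL’s, so this only illustrates the idea; the `syntax_ok` helper is hypothetical):

```python
import sqlite3

def syntax_ok(sql: str) -> bool:
    """Return True if the statement parses; missing tables don't count as errors."""
    try:
        sqlite3.connect(":memory:").execute(sql)
        return True
    except sqlite3.OperationalError as e:
        # "no such table" means the syntax was fine but the catalog is empty
        return "syntax error" not in str(e)

assert syntax_ok("SELECT a, b FROM some_table WHERE a > 1")
assert not syntax_ok("SELEC a FROM t")
```

The same trick works against Spark: parse (or explain) the statement against an empty session and report parser errors without launching a job.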
From: Naveen [mailto:hadoopst...@gmail.com]
Sent: December 25, 2016 0:33
To: Linyuxin <linyu...@huawei.com>
Cc: user <user@spark.apache.org>
Subject: Re: Re: submit spark task on yarn asynchronously via java?
Hi,
Please use the SparkLauncher API class and invoke it asynchronously from your own threads.
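A sketch of that suggestion. In Java this would be `org.apache.spark.launcher.SparkLauncher` (from Spark 1.6, `startApplication` returns a handle you can poll or attach listeners to; on 1.5.1 you would call `launch()` and watch the returned Process). The same non-blocking pattern in Python via `subprocess`, with a stub command standing in for `spark-submit` (the spark-submit arguments shown in the comment are illustrative):

```python
import subprocess
import sys

def submit_async(cmd):
    # Popen returns immediately; the job keeps running in the background.
    return subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

# Against a real cluster this would look like:
# handle = submit_async(["spark-submit", "--master", "yarn",
#                        "--deploy-mode", "cluster", "my-app.jar"])
handle = submit_async([sys.executable, "-c", "print('stub job')"])  # stand-in

# Monitor without blocking: poll() is None while the job is still running.
while handle.poll() is None:
    pass  # in real code: sleep, or query the YARN REST API for app status

assert handle.returncode == 0
```

With yarn-cluster mode, monitoring usually goes through the YARN application ID rather than the local process, but the submit-then-poll shape is the same.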
Hi,
Could anybody help?
From: Linyuxin
Sent: December 22, 2016 14:18
To: user <user@spark.apache.org>
Subject: submit spark task on yarn asynchronously via java?
Hi All,
Version:
Spark 1.5.1
Hadoop 2.7.2
Is there any way to submit and monitor a Spark task on YARN via Java asynchronously?
Hi All,
I want to know how to avoid SQL injection in Spark SQL.
Is there any common pattern for this?
E.g. some useful tool or code segment, or do I just have to build the “wheel” on Spark SQL myself?
Thanks.
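Since Spark SQL of this era has no parameterized form of `sqlContext.sql(...)`, the usual patterns are (a) prefer the DataFrame API, where user input is passed as a literal value rather than spliced into SQL text, and (b) when you must build SQL strings, escape literals and whitelist identifiers. A minimal sketch of (b) in Python; the helper names are hypothetical:

```python
import re

def quote_literal(value: str) -> str:
    # Standard SQL: escape embedded single quotes by doubling them.
    return "'" + value.replace("'", "''") + "'"

def safe_identifier(name: str) -> str:
    # Whitelist table/column names: plain identifiers only.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError("illegal identifier: " + name)
    return name

user_input = "x' OR '1'='1"  # classic injection attempt
query = ("SELECT * FROM " + safe_identifier("users")
         + " WHERE name = " + quote_literal(user_input))
# The quotes are doubled, so the input stays one string literal:
# SELECT * FROM users WHERE name = 'x'' OR ''1''=''1'
```

Option (a) is still the safer default: `df.filter(df("name") === userInput)` never interprets the input as SQL at all.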
Hi All,
Is there any reference on performance tuning for Spark SQL?
I can only find tuning guides for Spark Core on http://spark.apache.org/
Hi All,
Newbie here.
My Spark version is 1.5.1.
I want to know where I can find the specification of Spark SQL, to find out whether syntax such as ‘a like %b_xx’ is supported.
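Spark SQL does support `LIKE` with the standard SQL wildcards: `%` matches any sequence and `_` matches exactly one character (so a literal underscore in a pattern needs an ESCAPE clause). Since the semantics are the standard ones, they can be sanity-checked with Python’s stdlib sqlite3 as a stand-in engine:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (s TEXT)")
con.executemany("INSERT INTO t VALUES (?)",
                [("ab_xx",), ("b_xx",), ("bYxx",), ("abcd",)])

# '%b_xx': anything, then 'b', then any ONE character, then 'xx'
rows = [r[0] for r in con.execute("SELECT s FROM t WHERE s LIKE '%b_xx'")]
assert set(rows) == {"ab_xx", "b_xx", "bYxx"}  # 'abcd' does not match
```

Note that `bYxx` matches too, because `_` in the pattern is a wildcard, not a literal underscore.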