Updating Maven version to 3.3.9 solved the issue
Thanks everyone!
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/latest-Spark-build-error-tp15782p15787.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
This is because building Spark requires Maven 3.3.3 or later.
http://spark.apache.org/docs/latest/building-spark.html
Regards,
Kazuaki Ishizaki
From: salexln
To: dev@spark.apache.org
Date: 2015/12/25 15:52
Subject: latest Spark build error
Hi,
I am new to Scala and Spark and am trying to find the relevant API in
DataFrame to solve my problem, as described in the title. However, I have only found
the API DataFrame.col(colName: String): Column, which returns a Column
object, not the content. If only DataFrame supported such an API
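One common way to get the contents of a single column as an Array is to select the column and collect it, then pull the value out of each Row. A minimal sketch, assuming a local SparkSession and a hypothetical "value" column (the object name and data here are illustrative, not from the thread):

```scala
import org.apache.spark.sql.SparkSession

object ColumnToArray {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("column-to-array")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("a", 1), ("b", 2), ("c", 3)).toDF("name", "value")

    // select() yields a one-column DataFrame; collect() materializes the rows
    // on the driver, and map extracts the raw value out of each Row.
    val values: Array[Int] = df.select("value").collect().map(_.getInt(0))
    println(values.mkString(","))   // 1,2,3

    spark.stop()
  }
}
```

Note that collect() pulls the whole column to the driver, so this only makes sense when the column fits in driver memory.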
I think the shuffle write size depends not on your data size but on the join
operation. Perhaps your join does not need to shuffle more data, because
the table data is already on the right partitions, so no shuffle write is needed.
Is that possible?
2015-12-25 0:53 GMT+08:00 gsvic
Hi all,
I'm getting a build error when trying to build a clean version of the latest
Spark. I did the following:
1) git clone https://github.com/apache/spark.git
2) build/mvn -DskipTests clean package
But I get the following error:
Spark Project Parent POM .. FAILURE [2.338s]
Thanks, Jeff. It’s not about choosing some columns of a Row. It’s about selecting all the data
in one column and converting it to an Array. Do you understand what I mean?
In Chinese: Based on this column name, I want to select all the data in the column and then put it into an array.
From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: December 25, 2015, 15:39
To: zml张明磊
Cc:
Is there any formula with which I could determine the shuffle write size before
execution?
For example, in a Sort-Merge join, in the stage in which the first table is
being loaded, the shuffle write is 429.2 MB. The table is 5.5 GB in HDFS
with a block size of 128 MB. Consequently, it is being loaded in 45
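The partition count quoted for the load stage follows directly from the HDFS block size: the number of input partitions is roughly the number of blocks, i.e. ceil(fileSize / blockSize). A back-of-the-envelope check in plain Scala, using the figures from the message above (the 5.5 GB is presumably approximate, which would account for 44 vs. 45):

```scala
// Rough estimate of how many input partitions (HDFS blocks) a file yields.
// Figures are the ones quoted in the message above; 5.5 GB is approximate.
object PartitionEstimate {
  def numBlocks(fileSizeMb: Double, blockSizeMb: Double): Long =
    math.ceil(fileSizeMb / blockSizeMb).toLong

  def main(args: Array[String]): Unit = {
    val blocks = numBlocks(fileSizeMb = 5.5 * 1024, blockSizeMb = 128)
    println(blocks) // 44 with exactly 5.5 GB; 45 if the file is slightly larger
  }
}
```

The shuffle write itself is harder to predict from file size alone, since it is the serialized, possibly compressed size of the join keys and projected columns rather than the raw input.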
It’s not that likely to get an answer, as it’s really a support call, not a
bug/task.
The first question is about proper documentation of all the stuff we’ve
been discussing in this thread, so one would think that’s a valid task. It
doesn’t seem right that closer.lua, for example, is undocumented.
+1
Tested on HDP 2.3, YARN cluster mode, spark-shell
On Wed, Dec 23, 2015 at 6:14 AM, Allen Zhang wrote:
>
> +1 (non-binding)
>
> I have just built a new binary tarball and tested am.nodelabelexpression and
> executor.nodelabelexpression manually; the result is as expected.
>
On 24 Dec 2015, at 05:59, Nicholas Chammas wrote:
FYI: I opened an INFRA ticket with questions about how best to use the Apache
mirror network.
https://issues.apache.org/jira/browse/INFRA-10999
Nick
Hi,
While reviewing DAGScheduler, and where the failedStages internal
collection of failed stages ready for resubmission is used, I came
across a question to which I'm looking for an answer. Any hints would
be greatly appreciated.
When resubmitFailedStages [1] is executed and there are any failed
stages, getMissingParentStages(stage) would be called for the stage (being
re-submitted).
If there are no missing parents, submitMissingTasks() would be called.
If there are missing parents, each parent would go through the same flow.
I don't see an issue in this part of the code.
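The flow described above can be sketched as a simplified, pure-Scala model. This is not the real DAGScheduler code: the types and method bodies are stand-ins, and the real scheduler waits for stage-completion events rather than recursing synchronously, but the parent-first control flow is the same idea.

```scala
// A simplified model of the resubmission flow described above.
// Stage, missing-parent lookup, and task submission are all stand-ins,
// not the real DAGScheduler types.
object ResubmitModel {
  final case class Stage(id: Int, parents: List[Stage], var complete: Boolean)

  // Stand-in for getMissingParentStages(stage).
  def missingParents(stage: Stage): List[Stage] =
    stage.parents.filterNot(_.complete)

  // Stand-in for submitStage(stage): submit missing parents first,
  // then the stage's own tasks.
  def submitStage(stage: Stage,
                  submitted: scala.collection.mutable.ListBuffer[Int]): Unit = {
    val missing = missingParents(stage)
    if (missing.isEmpty) {
      submitted += stage.id          // submitMissingTasks(stage)
      stage.complete = true
    } else {
      missing.foreach(submitStage(_, submitted)) // parents go through the same flow
      submitStage(stage, submitted)              // then retry the child
    }
  }

  def main(args: Array[String]): Unit = {
    val parent = Stage(1, Nil, complete = false)
    val failed = Stage(2, List(parent), complete = false)
    val order  = scala.collection.mutable.ListBuffer.empty[Int]
    submitStage(failed, order)
    println(order.mkString(",")) // 1,2 — parent submitted before the failed stage
  }
}
```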
Cheers
On Thu, Dec 24,