Re: latest Spark build error

2015-12-24 Thread salexln
Updating the Maven version to 3.3.9 solved the issue. Thanks everyone!

Re: latest Spark build error

2015-12-24 Thread Kazuaki Ishizaki
This is because building Spark requires Maven 3.3.3 or later. http://spark.apache.org/docs/latest/building-spark.html Regards, Kazuaki Ishizaki From: salexln To: dev@spark.apache.org Date: 2015/12/25 15:52 Subject: latest Spark build error Hi all, I'm
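Since the failure comes down to the Maven minimum, a small shell sketch for checking a version string against the 3.3.3 requirement. This is an illustration, not part of the Spark build: the `current` value is hard-coded here and would normally come from `build/mvn -version`.

```shell
# Hedged sketch: compare a Maven version string against Spark's minimum
# (3.3.3) using sort -V. In practice, capture "current" from the first
# line of: build/mvn -version
required="3.3.3"
current="3.3.9"
lowest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
  echo "Maven $current meets the minimum ($required)"
else
  echo "Maven $current is too old; upgrade, or let build/mvn fetch one"
fi
```

Using `build/mvn` instead of a system `mvn` sidesteps the problem entirely, since it downloads a suitable Maven if the local one is too old.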

How can I get the column data based on specific column name and then stored these data in array or list ?

2015-12-24 Thread zml张明磊
Hi, I am new to Scala and Spark and am trying to find the relevant DataFrame API to solve the problem described in the title. However, I have only found DataFrame.col(colName: String): Column, which returns a Column object, not the content. If only DataFrame supported such an API
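For reference, one way to do what the question asks (collect the contents of a single column into a local array) — a hedged sketch, assuming an existing DataFrame `df` with a string column hypothetically named "name" and the SparkSession implicits in scope:

```scala
// Sketch only: assumes a DataFrame `df` with a string column "name"
// and `import spark.implicits._` (for the String encoder) in scope.
// Selecting the column and collecting pulls its values back to the
// driver as a local Array.
val names: Array[String] =
  df.select("name").as[String].collect()

// Equivalent without the Dataset encoder: collect Rows, then extract
// the single field from each.
val names2: Array[String] =
  df.select("name").collect().map(_.getString(0))
```

Note that `collect()` materializes the whole column on the driver, so this is only appropriate when the column fits in driver memory.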

Re: Shuffle Write Size

2015-12-24 Thread Xingchi Wang
I think the shuffle write size does not depend on your data but on the join operation. Maybe your join action does not need to shuffle much data, because the table data is already on its partitions, so no shuffle write is needed. Is that possible? 2015-12-25 0:53 GMT+08:00 gsvic
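As an illustration of how the join strategy, rather than raw input size, drives shuffle write: if one side is small enough to broadcast, the large side is never shuffled at all. A hedged sketch, where `largeDf` and `smallDf` are hypothetical DataFrames with a common key column "id":

```scala
import org.apache.spark.sql.functions.broadcast

// Hypothetical DataFrames. With the broadcast hint the small table is
// shipped whole to every executor, so neither side is shuffle-written;
// without it, a sort-merge join would shuffle both sides by the key.
val joined = largeDf.join(broadcast(smallDf), "id")
```

This is one reason the shuffle write metric can be far smaller than the table sizes involved in the join.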

latest Spark build error

2015-12-24 Thread salexln
Hi all, I'm getting a build error when trying to build a clean version of the latest Spark. I did the following: 1) git clone https://github.com/apache/spark.git 2) build/mvn -DskipTests clean package But I get the following error: Spark Project Parent POM .. FAILURE [2.338s]

Re: How can I get the column data based on specific column name and then stored these data in array or list ?

2015-12-24 Thread zml张明磊
Thanks, Jeff. It's not about choosing some columns of a Row; it's about choosing all the data in one column and converting it to an Array. Do you understand what I mean? In Chinese: I want to select all the data in that column based on the column name, and then put it into an array. From: Jeff Zhang [mailto:zjf...@gmail.com] Sent: 2015-12-25 15:39 To: zml张明磊 Cc:

Shuffle Write Size

2015-12-24 Thread gsvic
Is there any formula with which I could determine the shuffle write size before execution? For example, in a Sort Merge Join, in the stage in which the first table is loaded, the shuffle write is 429.2 MB. The table is 5.5 GB in HDFS with a block size of 128 MB. Consequently, it is loaded in 45
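A back-of-envelope check of the partition count mentioned above, under the usual assumption of one input partition per HDFS block:

```scala
// 5.5 GB input with 128 MB HDFS blocks: roughly one partition per block.
val fileSizeMB  = 5.5 * 1024            // 5632 MB
val blockSizeMB = 128.0
val partitions  = math.ceil(fileSizeMB / blockSizeMB).toInt
// 5632 / 128 = 44 exactly; a count of 45 would suggest the file is
// slightly over 5.5 GB, leaving a final partial block.
```

Note this estimates the number of input partitions, not the shuffle write size itself, which also depends on serialization and compression of the shuffled records.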

Re: Downloading Hadoop from s3://spark-related-packages/

2015-12-24 Thread Nicholas Chammas
not that likely to get an answer as it’s really a support call, not a bug/task. The first question is about proper documentation of all the stuff we’ve been discussing in this thread, so one would think that’s a valid task. It doesn’t seem right that closer.lua, for example, is undocumented.

Re: [VOTE] Release Apache Spark 1.6.0 (RC4)

2015-12-24 Thread Vinay Shukla
+1 Tested on HDP 2.3, YARN cluster mode, spark-shell. On Wed, Dec 23, 2015 at 6:14 AM, Allen Zhang wrote: > +1 (non-binding) > I have just built a new binary tarball and tested am.nodelabelexpression and > executor.nodelabelexpression manually; the result is as expected.

Re: Downloading Hadoop from s3://spark-related-packages/

2015-12-24 Thread Steve Loughran
On 24 Dec 2015, at 05:59, Nicholas Chammas wrote: FYI: I opened an INFRA ticket with questions about how best to use the Apache mirror network. https://issues.apache.org/jira/browse/INFRA-10999 Nick not that likely to get an

[DAGScheduler] resubmitFailedStages, failedStages.clear() and submitStage

2015-12-24 Thread Jacek Laskowski
Hi, While reviewing DAGScheduler, specifically where the failedStages internal collection of failed stages ready for resubmission is used, I came across a question for which I'm looking for an answer. Any hints would be greatly appreciated. When resubmitFailedStages [1] is executed, and there are any failed

Re: [DAGScheduler] resubmitFailedStages, failedStages.clear() and submitStage

2015-12-24 Thread Ted Yu
getMissingParentStages(stage) would be called for the stage being re-submitted. If there are no missing parents, submitMissingTasks() would be called. If there are missing parents, each parent would go through the same flow. I don't see an issue in this part of the code. Cheers On Thu, Dec 24,
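The flow Ted describes can be sketched as follows. This is a simplification, not the actual DAGScheduler source: the method names mirror the real ones, but the bodies here are illustrative only.

```scala
// Simplified sketch of the stage-submission recursion in DAGScheduler.
// Stage, getMissingParentStages, submitMissingTasks and waitingStages
// stand in for the real members; signatures are approximate.
def submitStage(stage: Stage): Unit = {
  val missing = getMissingParentStages(stage)
  if (missing.isEmpty) {
    // No missing parents: the stage's tasks can run now.
    submitMissingTasks(stage)
  } else {
    // Otherwise each missing parent goes through the same flow first,
    // and this stage waits until they complete.
    missing.foreach(submitStage)
    waitingStages += stage
  }
}
```

So a resubmitted failed stage is never run before its parents: it either submits immediately or recursively drives its missing parents through the same check.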