UNSUBSCRIBE

2015-12-11 Thread williamtellme123



UNSUBSCRIBE

2017-01-12 Thread williamtellme123
 

 

From: Harjit Singh [mailto:harjit.si...@deciphernow.com] 
Sent: Tuesday, April 26, 2016 3:11 PM
To: user@spark.apache.org
Subject: test

unsubscribe

2017-05-10 Thread williamtellme123
 

 

From: Aaron Jackson [mailto:ajack...@pobox.com] 
Sent: Tuesday, July 19, 2016 7:17 PM
To: user 
Subject: Heavy Stage Concentration - Ends With Failure

 

Hi,

 

I have a cluster with 15 nodes of which 5 are HDFS nodes.  I kick off a job 
that creates some 120 stages.  Eventually, the active and pending stages reduce 
down to a small bottleneck and it never fails... the tasks associated with the 
10 (or so) running tasks are always allocated to the same executor on the same 
host.

 

Sooner or later, it runs out of memory ... or some other resource.  It falls 
over and then the tasks are reallocated to another executor.

 

Why do we see such heavy concentration of tasks onto a single executor when 
other executors are free?  Were the tasks assigned to an executor when the job 
was decomposed into stages?



unsubscribe

2017-05-10 Thread williamtellme123
unsubscribe

 

From: Aaron Perrin [mailto:aper...@gravyanalytics.com] 
Sent: Tuesday, January 31, 2017 9:42 AM
To: user @spark 
Subject: Multiple quantile calculations

 

I want to calculate quantiles on two different columns.  I know that I can 
calculate them with two separate operations. However, for performance reasons, 
I'd like to calculate both with one operation. 

 

Is it possible to do this with the Dataset API? I'm assuming that it isn't. 
But, if it isn't, is it possible to calculate both in one pass, assuming that I 
made some code changes? I briefly looked at the approxQuantile code, but I 
haven't dug into the algorithm.
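
A minimal plain-Java sketch of the single-pass idea, offered as an illustration and not as what Spark does internally. It gathers both columns in one scan of the rows and then computes exact nearest-rank quantiles by sorting. This is a swapped-in technique, not the approximate algorithm behind approxQuantile. The class TwoColumnQuantiles, its methods, and the sample rows are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark: exact quantiles for two columns
// collected during a single scan of the rows.
class TwoColumnQuantiles {

    // Nearest-rank quantile on an already sorted, non-empty array.
    static double quantile(double[] sorted, double p) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("empty column");
        }
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        int rank = (int) Math.ceil(p * sorted.length); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    // rows[i] = {valueOfColumnA, valueOfColumnB}.
    // result[c][k] is the quantile probs[k] of column c.
    static double[][] quantiles(double[][] rows, double[] probs) {
        double[] a = new double[rows.length];
        double[] b = new double[rows.length];
        for (int i = 0; i < rows.length; i++) { // the only pass over the rows
            a[i] = rows[i][0];
            b[i] = rows[i][1];
        }
        Arrays.sort(a);
        Arrays.sort(b);
        double[][] out = new double[2][probs.length];
        for (int k = 0; k < probs.length; k++) {
            out[0][k] = quantile(a, probs[k]);
            out[1][k] = quantile(b, probs[k]);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        double[][] rows = {{1, 40}, {2, 10}, {3, 30}, {4, 20}};
        double[][] q = quantiles(rows, new double[] {0.5, 1.0});
        System.out.println(Arrays.deepToString(q));
    }
}
```

On the Spark side, releases from 2.2 onward appear to offer an approxQuantile overload on DataFrameStatFunctions that takes an array of column names. It is worth checking the API docs for the version in use.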

 



unsubscribe

2017-05-20 Thread williamtellme123
unsubscribe

From: Abir Chakraborty [mailto:abi...@247-inc.com] 
Sent: Saturday, May 20, 2017 1:29 AM
To: user@spark.apache.org
Subject: unsubscribe

 

 



user-unsubscr...@spark.apache.org

2017-05-23 Thread williamtellme123
 

 

From: Abir Chakraborty [mailto:abi...@247-inc.com] 
Sent: Sunday, May 21, 2017 4:17 AM
To: user@spark.apache.org
Subject: unsubscribe

 

unsubscribe

 

 

 



user-unsubscr...@spark.apache.org

2017-05-23 Thread williamtellme123
user-unsubscr...@spark.apache.org

 

From: 萝卜丝炒饭 [mailto:1427357...@qq.com] 
Sent: Sunday, May 21, 2017 8:15 PM
To: user 
Subject: Are tachyon and akka removed from 2.1.1 please

 

Hi all,

I read a paper about the source code. The paper is based on version 1.2 and
refers to Tachyon and Akka. When I read the 2.1 code, I cannot find the code
for Akka and Tachyon.

 

Were Tachyon and Akka removed in 2.1.1?



user-unsubscr...@spark.apache.org

2017-05-23 Thread williamtellme123
 

 

From: Bibudh Lahiri [mailto:bibudhlah...@gmail.com] 
Sent: Sunday, May 21, 2017 9:34 AM
To: user 
Subject: unsubscribe

 

unsubscribe  



user-unsubscr...@spark.apache.org

2017-05-23 Thread williamtellme123
 

 

From: Arun [mailto:arunbm...@gmail.com] 
Sent: Saturday, May 20, 2017 9:48 PM
To: user@spark.apache.org
Subject: Rmse recomender system

 

 

hi all..

 

I am new to machine learning.

 

I am working on a recommender system. On the training dataset the RMSE is 0.08, 
while on the test data it is 2.345.

 

What can I conclude from this, and what steps can I take to improve it?
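
For reference, a minimal sketch of how RMSE is computed, so the two numbers can be compared on the same footing. A training error far below the test error, as in 0.08 versus 2.345, is usually read as a sign of overfitting. Common responses are stronger regularization, a simpler model, or more training data. The class RmseSketch and the sample ratings are hypothetical and are not taken from the poster's job.

```java
// Hypothetical helper: root mean squared error between two equal-length arrays.
class RmseSketch {

    static double rmse(double[] actual, double[] predicted) {
        if (actual.length == 0 || actual.length != predicted.length) {
            throw new IllegalArgumentException("arrays must be non-empty and equal in length");
        }
        double sumSquaredError = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double error = predicted[i] - actual[i];
            sumSquaredError += error * error;
        }
        return Math.sqrt(sumSquaredError / actual.length);
    }

    public static void main(String[] args) {
        // Made-up ratings: the model fits the training rows almost exactly
        // but misses badly on rows it has not seen.
        double[] trainActual = {4.0, 3.0, 5.0};
        double[] trainPredicted = {4.1, 2.9, 5.0};
        double[] testActual = {2.0, 5.0, 1.0};
        double[] testPredicted = {4.5, 2.5, 3.0};

        System.out.println("train RMSE = " + rmse(trainActual, trainPredicted));
        System.out.println("test RMSE  = " + rmse(testActual, testPredicted));
    }
}
```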

 

 

 

Sent from Samsung tablet



user-unsubscr...@spark.apache.org

2017-05-26 Thread williamtellme123
user-unsubscr...@spark.apache.org

 

From: ANEESH .V.V [mailto:aneeshnair.ku...@gmail.com] 
Sent: Friday, May 26, 2017 1:50 AM
To: user@spark.apache.org
Subject: unsubscribe

 

unsubscribe



user-unsubscr...@spark.apache.org

2017-05-25 Thread williamtellme123
 

 

From: Steffen Schmitz [mailto:steffenschm...@hotmail.de] 
Sent: Thursday, May 25, 2017 3:34 AM
To: ramnavan 
Cc: user@spark.apache.org
Subject: Re: Questions regarding Jobs, Stages and Caching

 

 



user-unsubscr...@spark.apache.org

2017-06-07 Thread williamtellme123
user-unsubscr...@spark.apache.org

 

From: 颜发才(Yan Facai) [mailto:facai@gmail.com] 
Sent: Wednesday, June 7, 2017 4:24 AM
To: kundan kumar 
Cc: spark users 
Subject: Re: Convert the feature vector to raw data

 

Hi, kumar.

How about removing the `select` in your code?

namely,

Dataset result = model.transform(testData);

result.show(1000, false);





 

On Wed, Jun 7, 2017 at 5:00 PM, kundan kumar wrote:

I am using 

 

Dataset result = model.transform(testData).select("probability", 
"label","features");

 result.show(1000, false);

 

In this case the feature vector is printed as output. Is there a way to print 
my original raw data instead of the feature vector, or to recover my raw data 
from the feature vector? All of the features in my dataset are categorical.
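
A minimal plain-Java sketch of the reverse lookup being asked about, assuming each categorical value was turned into a numeric index before it went into the feature vector. It keeps the label-to-index mapping and its inverse side by side. The class CategoryIndexer and the sample labels are hypothetical. Indices here follow first-seen order, which may differ from the ordering a real indexer uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: maps categorical labels to indices and back again.
class CategoryIndexer {
    private final Map<String, Integer> labelToIndex = new HashMap<>();
    private final List<String> indexToLabel = new ArrayList<>();

    // Returns the index for a label, assigning the next free index on first sight.
    int index(String label) {
        Integer existing = labelToIndex.get(label);
        if (existing != null) {
            return existing;
        }
        int next = indexToLabel.size();
        labelToIndex.put(label, next);
        indexToLabel.add(label);
        return next;
    }

    // Reverse lookup: recovers the raw label from its index.
    String label(int index) {
        return indexToLabel.get(index);
    }

    public static void main(String[] args) {
        CategoryIndexer colors = new CategoryIndexer();
        int red = colors.index("red");
        int blue = colors.index("blue");
        System.out.println(red + " -> " + colors.label(red));
        System.out.println(blue + " -> " + colors.label(blue));
    }
}
```

In Spark ML the IndexToString transformer plays this role for columns produced by StringIndexer. Transformers generally append columns instead of dropping inputs, so if the raw columns are still present in testData they can probably be selected alongside the prediction.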

 

Thanks,

Kundan

 



user-unsubscr...@spark.apache.org

2017-06-07 Thread williamtellme123
user-unsubscr...@spark.apache.org

 

From: kundan kumar [mailto:iitr.kun...@gmail.com] 
Sent: Wednesday, June 7, 2017 5:15 AM
To: 颜发才(Yan Facai) 
Cc: spark users 
Subject: Re: Convert the feature vector to raw data

 

Hi Yan, 

 

This doesn't work.

 

thanks,

kundan

 

On Wed, Jun 7, 2017 at 2:53 PM, 颜发才(Yan Facai) wrote:

Hi, kumar.

How about removing the `select` in your code?

namely,

Dataset result = model.transform(testData);

result.show(1000, false);





 

On Wed, Jun 7, 2017 at 5:00 PM, kundan kumar wrote:

I am using 

 

Dataset result = model.transform(testData).select("probability", 
"label","features");

 result.show(1000, false);

 

In this case the feature vector is printed as output. Is there a way to print 
my original raw data instead of the feature vector, or to recover my raw data 
from the feature vector? All of the features in my dataset are categorical.

 

Thanks,

Kundan

 

 



user-unsubscr...@spark.apache.org

2017-06-07 Thread williamtellme123
user-unsubscr...@spark.apache.org

user-unsubscr...@spark.apache.org

From: kundan kumar [mailto:iitr.kun...@gmail.com] 
Sent: Wednesday, June 7, 2017 4:01 AM
To: spark users 
Subject: Convert the feature vector to raw data

 

I am using 

 

Dataset result = model.transform(testData).select("probability", 
"label","features");

 result.show(1000, false);

 

In this case the feature vector is printed as output. Is there a way to print 
my original raw data instead of the feature vector, or to recover my raw data 
from the feature vector? All of the features in my dataset are categorical.

 

Thanks,

Kundan



user-unsubscr...@spark.apache.org

2017-05-30 Thread williamtellme123
 

 

From: Joel D [mailto:games2013@gmail.com] 
Sent: Monday, May 29, 2017 9:04 PM
To: user@spark.apache.org
Subject: Schema Evolution Parquet vs Avro

 

Hi,

 

We are trying to come up with the best storage format for handling schema 
changes in ingested data.

 

We noticed that both Avro and Parquet allow selecting by column name instead of 
by the position of the data. However, we are inclined towards Parquet for 
better read performance, since it is columnar and we will be selecting a few 
columns rather than all of them. Data will be processed and saved to 
partitions on which we will have Hive external tables.

 

Will Parquet be able to handle the following:

- Renaming a column in the middle of the schema

- Removing a column from the middle of the schema

- Changing the data type of an existing column (int to bigint should be allowed, right?)

 

Please advise. 

 

Thanks,

Sam