Re: Measure performance time in some spark transformations.

2018-05-13 Thread Jörn Franke
Can’t you find this in the Spark UI or timeline server? > On 13. May 2018, at 00:31, Guillermo Ortiz Fernández wrote: > > I want to measure how long different transformations in Spark take, such as map, joinWithCassandraTable and so on. Which one is the
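For a programmatic alternative to the Spark UI, a SparkListener can report how long each completed stage took. A minimal sketch, assuming an existing SparkContext named sc; the listener class and its name are illustrative, not from the thread:

    import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted}

    // Hypothetical helper: print the wall-clock duration of every completed stage.
    class StageTimingListener extends SparkListener {
      override def onStageCompleted(event: SparkListenerStageCompleted): Unit = {
        val info = event.stageInfo
        for {
          start <- info.submissionTime
          end   <- info.completionTime
        } println(s"Stage ${info.stageId} '${info.name}' took ${end - start} ms")
      }
    }

    // Usage (sc is an existing SparkContext):
    // sc.addSparkListener(new StageTimingListener)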

Measure performance time in some spark transformations.

2018-05-12 Thread Guillermo Ortiz Fernández
I want to measure how long different transformations in Spark take, such as map, joinWithCassandraTable and so on. Which one is the best approximation to do it? def time[R](block: => R): R = { val t0 = System.nanoTime() val result = block val t1 = System.nanoTime()
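A minimal sketch completing the truncated helper above; keep in mind that Spark transformations are lazy, so the block only does real work when it ends in an action such as count or collect:

    def time[R](block: => R): R = {
      val t0 = System.nanoTime()
      val result = block            // transformations run only once an action is triggered
      val t1 = System.nanoTime()
      println(s"Elapsed: ${(t1 - t0) / 1e6} ms")
      result
    }

    // Hypothetical usage: the count() action forces the preceding map to execute.
    // val n = time { rdd.map(_.length).count() }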

Weird experience Hive with Spark Transformations

2017-01-16 Thread Chetan Khatri
Hello, I have the following services configured and installed successfully: Hadoop 2.7.x, Spark 2.0.x, HBase 1.2.4, Hive 1.2.1. *Installation Directories:* /usr/local/hadoop /usr/local/spark /usr/local/hbase *Hive Environment variables:* #HIVE VARIABLES START export HIVE_HOME=/usr/local/hive

Re: Spark transformations

2016-09-12 Thread Thunder Stumpges
Yep, totally with you on this. None of it is ideal, but it doesn't sound like there will be any changes coming to the visibility of the ml supporting classes. -Thunder On Mon, Sep 12, 2016 at 10:10 AM janardhan shetty wrote: > Thanks Thunder. To copy the code base is difficult

Re: Spark transformations

2016-09-12 Thread janardhan shetty
Thanks Thunder. Copying the code base is difficult since we would need to copy it in its entirety, along with transitive dependency files. Complex operations that take a column as a whole, instead of each element in a row, are not possible as of now. Trying to find a few pointers to easily solve

Re: Spark transformations

2016-09-12 Thread Thunder Stumpges
Hi Janardhan, I have run into similar issues and asked similar questions. I also ran into many problems with private code when trying to write my own Model/Transformer/Estimator. (You might be able to find my question to the group regarding this; I can't really tell if my emails are getting

Re: Spark transformations

2016-09-06 Thread janardhan shetty
Noticed a few things about Spark transformers, just wanted to be clear. Unary transformer: createTransformFunc: IN => OUT = { *item* => } Here *item* is a single element and *NOT* the entire column. I would like to get the number of elements in that particular column. Since there is *no forward
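As an illustration of the point above, a minimal UnaryTransformer sketch in which createTransformFunc receives one cell at a time; the class and its column semantics are hypothetical, not from the thread:

    import org.apache.spark.ml.UnaryTransformer
    import org.apache.spark.ml.util.Identifiable
    import org.apache.spark.sql.types.{DataType, IntegerType}

    // Hypothetical transformer: maps each string in the input column to its length.
    // `item` below is a single element of the column, never the whole column.
    class LengthTransformer(override val uid: String)
        extends UnaryTransformer[String, Int, LengthTransformer] {

      def this() = this(Identifiable.randomUID("lengthTransformer"))

      override protected def createTransformFunc: String => Int = { item =>
        item.length
      }

      override protected def outputDataType: DataType = IntegerType
    }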

Re: Spark transformations

2016-09-04 Thread janardhan shetty
In Scala Spark ML DataFrames. On Sun, Sep 4, 2016 at 9:16 AM, Somasundaram Sekar <somasundar.se...@tigeranalytics.com> wrote: > Can you try this > > https://www.linkedin.com/pulse/hive-functions-udfudaf-udtf-examples-gaurav-singh > > On 4 Sep 2016 9:38 pm, "janardhan shetty"

Re: Spark transformations

2016-09-04 Thread Somasundaram Sekar
Can you try this https://www.linkedin.com/pulse/hive-functions-udfudaf-udtf-examples-gaurav-singh On 4 Sep 2016 9:38 pm, "janardhan shetty" wrote: > Hi, > > Is there any chance that we can send multiple columns in their entirety to a UDF and > generate a new column for Spark ML.

Spark transformations

2016-09-04 Thread janardhan shetty
Hi, Is there any chance that we can send multiple columns in their entirety to a UDF and generate a new column for Spark ML? I see a similar approach in VectorAssembler but am not able to use a few classes/traits like HasInputCols, HasOutputCol, DefaultParamsWritable since they are private. Any leads/examples is
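A minimal sketch of one way to do this with a plain Spark SQL UDF over several columns, which sidesteps the private ML traits mentioned above; the column names and combining logic are made up for illustration:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, udf}

    val spark = SparkSession.builder().master("local[*]").appName("multi-column-udf").getOrCreate()
    import spark.implicits._

    // Hypothetical input: three columns we want to fold into one new column.
    val df = Seq(("a", 1, 2.0), ("b", 3, 4.0)).toDF("name", "count", "score")

    // The UDF receives one value per input column for each row.
    val combine = udf((name: String, count: Int, score: Double) => s"$name:${count * score}")

    val withCombined = df.withColumn("combined", combine(col("name"), col("count"), col("score")))
    withCombined.show()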

Re: Calculating Min and Max Values using Spark Transformations?

2015-08-30 Thread Ashen Weerathunga
? Yong > If you already loaded csv data into a dataframe, why not register

Re: Calculating Min and Max Values using Spark Transformations?

2015-08-28 Thread Asher Krim
in this JavaRDD<double[]>? Or is there any way to access the array if I store max and min values to an array inside the Spark transformation class? Thanks. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Calculating-Min-and-Max-Values-using-Spark-Transformations

Calculating Min and Max Values using Spark Transformations?

2015-08-28 Thread ashensw
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Calculating-Min-and-Max-Values-using-Spark-Transformations-tp24491.html

Re: Calculating Min and Max Values using Spark Transformations?

2015-08-28 Thread Burak Yavuz
> If you already loaded csv data into a dataframe, why not register it as a table, and use

Re: Calculating Min and Max Values using Spark Transformations?

2015-08-28 Thread Jesse F Chen

RE: Calculating Min and Max Values using Spark Transformations?

2015-08-28 Thread java8964
Or RDD.max() and RDD.min() won't work for you? Yong > If you already loaded csv data into a dataframe, why
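A minimal sketch of both suggestions from this thread, RDD.min()/RDD.max() and an aggregate over a DataFrame column, written against the Spark 2.x API; the sample data is made up for illustration:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{max, min}

    val spark = SparkSession.builder().master("local[*]").appName("min-max").getOrCreate()
    import spark.implicits._

    // RDD approach: min() and max() are actions on a numeric RDD.
    val rdd = spark.sparkContext.parallelize(Seq(3.0, 7.5, 1.2))
    println(s"RDD min=${rdd.min()} max=${rdd.max()}")

    // DataFrame approach: if the CSV is already a DataFrame, aggregate the column directly.
    val df = rdd.toDF("value")
    df.agg(min("value"), max("value")).show()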

Re: Calculating Min and Max Values using Spark Transformations?

2015-08-28 Thread Alexey Grishchenko
-null and non-NA elements as well. Burak On Fri, Aug 28, 2015 at 10:09 AM, java8964 java8...@hotmail.com wrote: > Or RDD.max() and RDD.min() won't work for you? Yong

Unit Testing Spark Transformations/Actions

2015-06-16 Thread Mark Tse
Hi there, I am looking to use Mockito to mock out some functionality while unit testing a Spark application. I currently have code that happily runs on a cluster, but fails when I try to run unit tests against it, throwing a SparkException: org.apache.spark.SparkException: Job aborted due to
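One common way around that kind of failure is to run the transformation under test against a local master and keep Mockito mocks out of anything shipped to executors, since mocks are generally not serializable. A minimal sketch, assuming ScalaTest 3.1+; the suite name and data are illustrative:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.scalatest.funsuite.AnyFunSuite

    class WordLengthSuite extends AnyFunSuite {
      test("map transformation computes word lengths") {
        // Local master so the test needs no cluster.
        val conf = new SparkConf().setMaster("local[*]").setAppName("unit-test")
        val sc = new SparkContext(conf)
        try {
          // Run the real transformation on plain test data instead of mocked RDDs.
          val lengths = sc.parallelize(Seq("spark", "test")).map(_.length).collect()
          assert(lengths.sameElements(Array(5, 4)))
        } finally {
          sc.stop()
        }
      }
    }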