Hi
I have a basic question regarding SparkR:
1. Can we run R code in Spark using SparkR, or is it only certain Spark
functionalities that are exposed through R?
--
Thanks and Regards
Arun
Hi
Is there any predefined method to calculate histogram bins and frequencies
in Spark? Currently I take the range, compute the bins, and then count the
frequencies using a SQL query.
Is there any better way?
Also, how do I calculate percentiles in Spark 1.6?
--
Thanks and Regards
Arun
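For the histogram part: Spark's RDD[Double] API ships a built-in helper, `rdd.histogram(numBins)` (in `DoubleRDDFunctions`), which computes evenly spaced bins and their counts in one pass, avoiding the manual SQL counting. What it returns can be sketched in plain Python (no Spark, illustrative only):

```python
def histogram(values, num_bins):
    """Evenly spaced bins over [min, max]; the last bin is closed on the
    right, matching the usual rdd.histogram(numBins) convention."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    edges = [lo + i * width for i in range(num_bins + 1)]
    counts = [0] * num_bins
    for v in values:
        # clamp so v == max falls into the last bin instead of overflowing
        i = min(int((v - lo) / width), num_bins - 1)
        counts[i] += 1
    return edges, counts

edges, counts = histogram([1.0, 2.0, 2.5, 3.0, 9.0], 4)
```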
I think this may be some permission issue. Check your spark conf for the
> hadoop-related settings.
>
> --
> fightf...@163.com
>
>
> *From:* Arunkumar Pillai <arunkumar1...@gmail.com>
> *Date:* 2016-02-23 14:08
> *To:* user <user@spark.apache.org>
>
Hi
When I try to start spark-shell I'm getting the following error:
Exception in thread "main" java.lang.RuntimeException:
java.lang.reflect.InvocationTargetException
at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
at
Hi
I'm trying to build a Logistic Regression model using the ML Pipeline.
val lr = new LogisticRegression()
lr.setFitIntercept(true)
lr.setMaxIter(100)
val model = lr.fit(data)
println(model.summary)
I'm getting the coefficients, but I'm not able to get the predicted and
probability values.
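In ML Pipelines, the per-row outputs come from calling transform on the fitted model: `model.transform(data)` adds `rawPrediction`, `probability`, and `prediction` columns. What the probability column holds for binary logistic regression is just the sigmoid of the linear margin; a minimal sketch in plain Python (the coefficient values below are hypothetical, not from any real model):

```python
import math

def predict_proba(coefficients, intercept, features):
    """P(y=1 | x) for a fitted binary logistic model: sigmoid(w.x + b)."""
    margin = sum(w * x for w, x in zip(coefficients, features)) + intercept
    return 1.0 / (1.0 + math.exp(-margin))

# hypothetical fitted values, for illustration only
p = predict_proba([0.5, -0.25], 0.1, [2.0, 4.0])
```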
Hi
I have a DataFrame df and I use df.describe() to get the stats values, but
I'm not able to parse out the individual values. Please help.
--
Thanks and Regards
Arun
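One thing to be aware of: df.describe() returns a small DataFrame in which the stat values are string-typed, so they need an explicit parse after collect(). The idea can be sketched in plain Python (the row values below are made up):

```python
# shaped like df.describe().collect() for one numeric column:
# (summary label, string-typed value) -- values here are hypothetical
rows = [("count", "5"), ("mean", "3.5"), ("stddev", "1.29"),
        ("min", "2.0"), ("max", "5.0")]

# parse each statistic back to a float, keyed by its label
stats = {label: float(value) for label, value in rows}
```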
Hi
I'm using a SQL query to find the percentile value. Are there any
predefined functions for percentile calculation?
--
Thanks and Regards
Arun
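If a HiveContext is in use, Hive's `percentile` and `percentile_approx` UDAFs are typically callable from Spark SQL in 1.6 (e.g. SELECT percentile_approx(col, 0.5) FROM t). The exact computation, with linear interpolation between the two nearest ranks, can be sketched as:

```python
def percentile(values, p):
    """p-th percentile (p in [0, 1]) with linear interpolation
    between the two nearest ranks of the sorted sample."""
    xs = sorted(values)
    k = (len(xs) - 1) * p           # fractional rank
    f = int(k)                      # lower rank
    c = min(f + 1, len(xs) - 1)     # upper rank
    return xs[f] + (xs[c] - xs[f]) * (k - f)

median = percentile([1.0, 2.0, 3.0, 4.0], 0.5)
```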
Hi
I have observed that kurtosis values coming from Apache Spark differ by 3.
The values coming from Excel and from R are the same (11.333), but the
kurtosis value coming from Spark 1.6 differs by 3 (8.333).
Please let me know if I'm doing something wrong.
I'm executing via
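An exact offset of 3 almost always means the tools are using different conventions rather than anything being wrong: Spark's `kurtosis` aggregate reports *excess* kurtosis (0 for a normal distribution), while the other tool is reporting Pearson (raw) kurtosis (3 for a normal distribution). Sample bias corrections vary by tool, but the population-moment version of the two conventions is:

```python
def kurtosis_raw(xs):
    """Pearson kurtosis m4 / m2^2 from population central moments;
    equals 3.0 for a normal distribution."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2

def kurtosis_excess(xs):
    """Excess kurtosis, the convention Spark's kurtosis() follows."""
    return kurtosis_raw(xs) - 3.0
```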
Hi
I'm currently using the query
sqlContext.sql("SELECT MAX(variablesArray) FROM " + tableName)
to extract mean, max, and min.
Is there any better optimized way?
In the examples I saw df.groupBy("key").agg(skewness("a"), kurtosis("a")),
but I don't have a key anywhere in the data.
How to extract the
Hi
Currently, after creating a DataFrame, I'm querying max, min, and mean on
it to get the results:
sqlContext.sql("SELECT MAX(variablesArray) FROM " + tableName)
Is this an optimized way?
I'm not able to find all the stats (min, max, mean, variance, skewness,
kurtosis) directly from a DataFrame.
Please
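No key is needed here: `df.agg(...)` without a preceding groupBy aggregates the whole DataFrame, and `org.apache.spark.sql.functions` in 1.6 includes min, max, mean, variance, skewness, and kurtosis, so one agg call can return all of them together. For intuition, these stats only need a single pass over the data; a plain-Python sketch of min/max/mean/variance accumulated that way:

```python
def summary(xs):
    """min, max, mean and population variance in one pass,
    via running count, sum and sum of squares."""
    n, s, sq = 0, 0.0, 0.0
    lo, hi = float("inf"), float("-inf")
    for x in xs:
        n += 1
        s += x
        sq += x * x
        lo = min(lo, x)
        hi = max(hi, x)
    mean = s / n
    return {"min": lo, "max": hi, "mean": mean,
            "variance": sq / n - mean * mean}

stats = summary([1.0, 2.0, 3.0])
```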
Hi
Is it possible to get the AIC value in Linear Regression using the ML
pipeline? If so, please help me.
--
Thanks and Regards
Arun
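As far as I can see, the 1.6 LinearRegressionSummary does not expose AIC directly, but it can be derived from the residuals. One common Gaussian form, dropping additive constants, is AIC = n·ln(RSS/n) + 2k; a sketch (num_params should count the intercept if one was fitted):

```python
import math

def aic(residuals, num_params):
    """AIC for a Gaussian linear model, up to an additive constant:
    n * ln(RSS / n) + 2k, with k = number of fitted parameters."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * num_params

value = aic([1.0, -1.0, 1.0, -1.0], 2)
```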
Hi
I need the help page for Logistic Regression in the ML pipeline. When I
browsed, I only got the 1.6 help. Please help me.
--
Thanks and Regards
Arun
Hi
Are there any functions to find the distinct count of all the variables in
a DataFrame?
val sc = new SparkContext(conf) // spark context
val options = Map("header" -> "true", "delimiter" -> delimiter,
"inferSchema" -> "true")
val sqlContext = new org.apache.spark.sql.SQLContext(sc) // sql context
Use datasetDF.select(countDistinct(col1, col2, col3, ...)) or
> approxCountDistinct for an approximate result.
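Note that countDistinct(col1, col2, ...) with several arguments counts distinct *combinations* of those columns; for a per-variable count you would apply countDistinct to each column separately. The per-column idea, sketched in plain Python on a hypothetical two-column table:

```python
rows = [(1, "a"), (2, "a"), (2, "b")]        # hypothetical rows
columns = list(zip(*rows))                   # transpose rows into columns
distinct_counts = [len(set(col)) for col in columns]
```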
>
> 2016-01-05 17:11 GMT+08:00 Arunkumar Pillai <arunkumar1...@gmail.com>:
>
>> Hi
>>
>> Is there any functions to find distinct count of all the variables in
>
>> KeystoneML could be an alternative.
>>
>> Ardo.
>>
>> On 03 Jan 2016, at 15:50, Arunkumar Pillai <arunkumar1...@gmail.com>
>> wrote:
>>
>> Is there any road map for glm in pipeline?
>>
>>
>
--
Thanks and Regards
Arun
Is there any roadmap for GLM in the pipeline?
, meanSquaredError,
> rootMeanSquaredError and r2 as metrics of LinearRegression.
> Although actually you can get SSerr, SStot and SSreg from the composition
> of above metrics.
>
> Yanbo
>
>
> 2015-12-22 12:23 GMT+08:00 Arunkumar Pillai <arunkumar1...@gmail.com>:
Hi
I'm using Linear Regression from the ml package.
I can see that SSerr, SStot, and SSreg are computed in
val model = lr.fit(dat1)
model.summary.metric
but these metrics are not accessible. It would be good if we could get
those values.
Any suggestions?
--
Thanks and Regards
Arun
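Even without a public accessor, SSerr, SStot, and SSreg can be recovered from the metrics the summary does expose, using r2 = 1 − SSerr/SStot and (assuming meanSquaredError is the plain mean of squared residuals) mse = SSerr/n. A sketch of the back-calculation, with made-up metric values in the example:

```python
def sums_of_squares(n, mse, r2):
    """Recover SSerr, SStot, SSreg from n, meanSquaredError and r2,
    using mse = SSerr / n and r2 = 1 - SSerr / SStot."""
    ss_err = n * mse
    ss_tot = ss_err / (1.0 - r2)
    ss_reg = ss_tot - ss_err
    return ss_err, ss_tot, ss_reg

# hypothetical metric values
ss_err, ss_tot, ss_reg = sums_of_squares(10, 2.0, 0.8)
```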
Hi
I'm trying to use Linear Regression from the ml library,
but the problem is that the independent variables have to be assembled
into a single vector column.
My code snippet is as follows:
var dataDF = sqlContext.emptyDataFrame
dataDF = sqlContext.sql("SELECT "+
dependentVariable+","+independentVariables +" FROM " +
Hi
I'm using the ml.LinearRegression package.
How do I get the estimates and standard errors for the coefficients?
PFB the code snippet:
val lr = new LinearRegression()
  .setMaxIter(10)
  .setRegParam(0.01)
  .setFitIntercept(true)
val model = lr.fit(test)
val estimates = model.summary
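The 1.6 summary doesn't expose coefficient standard errors (later releases added a coefficientStandardErrors field to the summary when the normal solver is used). The classical OLS formula is se(β) = sqrt(σ² · diag((XᵀX)⁻¹)) with σ² = RSS/(n − k); for the one-predictor case it reduces to this short plain-Python sketch:

```python
import math

def slope_and_se(xs, ys):
    """OLS slope and its standard error for simple regression:
    se(b) = sqrt(sigma2 / Sxx) with sigma2 = RSS / (n - 2)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                            # slope estimate
    a = my - b * mx                          # intercept estimate
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sigma2 = rss / (n - 2)                   # residual variance
    return b, math.sqrt(sigma2 / sxx)

slope, se = slope_and_se([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```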
Hi
I want to find the matrix inverse of (X^T * X). PFB my code.
This code does not work for even a slightly larger dataset. Please help me
check whether the approach is correct.
val sqlQuery = "SELECT column1,column2 ,column3 FROM " + tableName
val matrixDF = sqlContext.sql(sqlQuery)
var
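One practical note on this approach: X^T X is only k×k (k = number of columns), so even when X itself is huge, the product can be computed in a distributed way and the small result inverted locally on the driver. A plain-Python Gauss-Jordan inverse for such a small matrix, as a sketch (partial pivoting only, no further numerical hardening):

```python
def inverse(m):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(m)
    # build the augmented matrix [m | I]
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(a[r][i]))  # pivot row
        a[i], a[p] = a[p], a[i]
        piv = a[i][i]
        a[i] = [v / piv for v in a[i]]       # normalize pivot row
        for r in range(n):
            if r != i:                        # eliminate column i elsewhere
                f = a[r][i]
                a[r] = [v - f * w for v, w in zip(a[r], a[i])]
    return [row[n:] for row in a]            # right half is the inverse

inv = inverse([[4.0, 7.0], [2.0, 6.0]])
```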
How to get intercept in Linear Regression Model?
LinearRegressionWithSGD.train(parsedData, numIterations)
--
Thanks and Regards
Arun
Hi
The regression algorithm in MLlib minimizes a loss function (via SGD) to
calculate the regression estimates, while R uses the closed-form matrix
method to calculate the estimates.
I see some difference between the results of Spark and R.
I was using the following class
LinearRegressionWithSGD.train(parsedData,
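A small numeric illustration of why the two can differ: the closed-form (normal-equation) estimate R's lm() uses is exact, while gradient descent only approaches it iteratively, so step size and iteration count matter. Plain Python, using full-batch gradient descent for simplicity:

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]                    # exactly y = 1 + 2x

# closed-form (normal-equation) estimates, as R's lm() computes them
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
    / sum((x - mx) ** 2 for x in xs)
a = my - b * mx

# full-batch gradient descent on mean squared error: only converges
# toward (a, b) gradually, and diverges if the step size is too large
ga, gb, lr = 0.0, 0.0, 0.02
for _ in range(5000):
    err = [(ga + gb * x) - y for x, y in zip(xs, ys)]
    ga -= lr * 2.0 / n * sum(err)
    gb -= lr * 2.0 / n * sum(e * x for e, x in zip(err, xs))
```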
Hi
I need an example of Linear Regression using OLS.
val data = sc.textFile("data/mllib/ridge-data/lpsa.data")
val parsedData = data.map { line =>
  val parts = line.split(',')
  LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(_.toDouble)))
}.cache()
// Building the model
val numIterations = 100
val model = LinearRegressionWithSGD.train(parsedData, numIterations)
Hi
I need to find the inverse of the (X^T * X) matrix. I have worked out the
transpose and the matrix multiplication.
Is there any way to find the inverse of the matrix?
--
Thanks and Regards
Arun
Hi
I've started using Apache Spark 1.5.2. I'm able to see GLM in SparkR, but
it is not there in MLlib. Are there any plans or a roadmap for that?
--
Thanks and Regards
Arun