I dumped the trees in the random forest model and occasionally saw a leaf
node with strange stats:
- pred=1.00 prob=0.80 imp=-1.00
I am running LogisticRegressionWithLBFGS. I got these lines on my console:
2015-03-12 17:38:03,897 ERROR breeze.optimize.StrongWolfeLineSearch |
Encountered bad values in function evaluation. Decreasing step size to 0.5
2015-03-12 17:38:03,967 ERROR breeze.optimize.StrongWolfeLineSearch |
When I run Spark 1.2.1, I see this output that wasn't in previous releases:
[Stage 12:= (6 + 1) / 16]
[Stage 12: (8 + 1) / 16]
[Stage 12:==
I wonder what algorithm is used to implement sortByKey? I assume it is some
O(n*log(n)) sort parallelized across x number of nodes, right?
Then, what size of data would make it worthwhile to use sortByKey across
multiple processors rather than using standard Scala sort functions on a
single processor?
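For a concrete point of comparison, here is the single-processor baseline the question mentions, as a sketch in plain Scala; the Spark call is shown commented out because it needs a running SparkContext, and the sample data is made up:

```scala
// Single-machine baseline: plain Scala sort, O(n log n) on one core.
val data = Seq((3, "c"), (1, "a"), (2, "b"))
val sortedLocal = data.sortBy(_._1)
// Distributed counterpart (requires a live SparkContext `sc`):
// val sortedDistributed = sc.parallelize(data).sortByKey().collect()
```

For small inputs the local sort avoids all shuffle and serialization overhead; the distributed version only pays off once the data no longer fits comfortably on one machine.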
My code seemed to deadlock when I tried to do this:
object MoreRdd extends Serializable {
  def apply(i: Int) = {
    val rdd2 = sc.parallelize(0 to 10)
    rdd2.map(j => i * 10 + j).collect
  }
}
val rdd1 = sc.parallelize(0 to 10)
val y = rdd1.map(i =>
I didn't know this restriction. Thank you.
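The restriction (RDDs cannot be created or used inside another RDD's transformations) can usually be worked around by building both datasets on the driver and combining them. A minimal sketch with plain collections, assuming the goal is the same cross product as the code above; the commented line names the Spark-side equivalent, cartesian:

```scala
// Build both datasets on the driver and combine them there, instead of
// creating rdd2 inside rdd1's map (which Spark does not allow).
// Spark equivalent: rdd1.cartesian(rdd2).map { case (i, j) => i * 10 + j }
val r1 = 0 to 10
val r2 = 0 to 10
val result = for (i <- r1; j <- r2) yield i * 10 + j
```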
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Creating-an-RDD-in-another-RDD-causes-deadlock-tp13302p13304.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
I need some advice regarding how data are stored in an RDD. I have millions
of records, called Measures. They are bucketed with keys of String type.
I wonder whether I should store them as RDD[(String, Measure)] or
RDD[(String, Iterable[Measure])], and why?
Data in each bucket are not related.
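For what it's worth, the two layouts are interconvertible. A plain-Scala sketch of the grouping step (Measure and the sample keys are made up here; in Spark the analogous step would be groupByKey on the flat pair RDD):

```scala
case class Measure(value: Double)

// Flat layout: one (key, measure) pair per record.
val flat = Seq(("k1", Measure(1.0)), ("k2", Measure(2.0)), ("k1", Measure(3.0)))

// Bucketed layout: all measures for a key collected together.
val bucketed: Map[String, Seq[Measure]] =
  flat.groupBy(_._1).map { case (k, pairs) => k -> pairs.map(_._2) }
```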
It would be nice if an RDD that was massaged by OrderedRDDFunctions could
know its neighbors.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Finding-previous-and-next-element-in-a-sorted-RDD-tp12621p12664.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
I have an RDD containing elements sorted in certain order. I would like to
map over the elements knowing the values of their respective previous and
next elements.
With regular List, I used to do this: (input is a List below)
// The first of the previous measures and the last of the next
One way is to call zipWithIndex on the RDD and use the index as a key. Add
or subtract 1 from the index to key the previous or next element, then use
cogroup or join to bind them together.
val idx = input.zipWithIndex
val previous = idx.map(x => (x._2 + 1, x._1))
val current = idx.map(x => (x._2, x._1))
val next = idx.map(x => (x._2 - 1, x._1))
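The same shift-the-index idea, sketched with plain Scala collections so it runs without a cluster (a Map lookup stands in for the join; the sample input is made up):

```scala
val input = List("a", "b", "c", "d")
val idx = input.zipWithIndex // (value, index)

// Key each value by the index it is the "previous" / "next" neighbor of.
val previous = idx.map { case (v, i) => (i + 1) -> v }.toMap
val next     = idx.map { case (v, i) => (i - 1) -> v }.toMap

// Each element paired with its optional neighbors.
val withNeighbors = idx.map { case (v, i) => (previous.get(i), v, next.get(i)) }
```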
I restarted Spark Master with spark-0.9.1 and SparkR was able to communicate
with the Master. I am using the latest SparkR pkg-e1f95b6. Maybe it has a
problem communicating with Spark 1.0.0?
I tried installing the latest Spark 1.0.1 and SparkR couldn't find the master
either. I restarted with Spark 0.9.1 and SparkR was able to find the
master. So, something seems to have changed starting with Spark 1.0.0.
They don't work in the new 1.0.1 either.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/cores-option-in-spark-shell-tp6809p9690.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Not sure that was what I wanted. I tried to run Spark Shell on a machine other
than the master and got the same error. The 192 was supposed to be a
simple shell script change that alters SPARK_HOME before submitting jobs.
Too bad it isn't there anymore.
The build described in the pull request
I have a cluster running. I was able to run Spark Shell and submit programs.
But when I tried to use SparkR, I got these errors:
wifi-orcus:sparkR cwang$ MASTER=spark://wifi-orcus.dhcp.carrieriq.com:7077 sparkR
R version 3.1.0 (2014-04-10) -- Spring Dance
Copyright (C) 2014 The R Foundation
Andrew,
Thanks for replying. I did the following and the result was still the same.
1. Added spark.home /root/spark-1.0.0 to local conf/spark-defaults.conf,
where /root was the place in the cluster where I put Spark.
2. Ran bin/spark-shell --master