The SparkConf doesn't allow you to set arbitrary variables. You can use
SparkContext's HadoopRDD and create a JobConf (with whatever variables you
want), and then grab them out of the JobConf in your RecordReader.
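The suggestion above can be sketched in Scala. This is a minimal illustration, not a tested recipe: the property name "my.custom.key" and the input path are hypothetical, and any InputFormat/RecordReader pair would see the same JobConf.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.{FileInputFormat, JobConf, TextInputFormat}
import org.apache.spark.SparkContext

// Sketch: pass arbitrary key/value pairs through a JobConf, since
// SparkConf only accepts spark.* settings. "my.custom.key" is a
// hypothetical property name for illustration.
val sc = new SparkContext("local[*]", "jobconf-example")
val jobConf = new JobConf(sc.hadoopConfiguration)
jobConf.set("my.custom.key", "my-value") // arbitrary variable
FileInputFormat.setInputPaths(jobConf, "hdfs:///data/input") // hypothetical path

// hadoopRDD hands this JobConf to the InputFormat, so a custom
// RecordReader can read "my.custom.key" back via conf.get(...).
val rdd = sc.hadoopRDD(
  jobConf,
  classOf[TextInputFormat],
  classOf[LongWritable],
  classOf[Text])
```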
On Sun, Feb 22, 2015 at 4:28 PM, hnahak harihar1...@gmail.com wrote:
Hi,
I
Spark gives you four of the classical collectives: broadcast, reduce,
scatter, and gather. There are also a few additional primitives, mostly
based on a join. Spark is certainly less optimized than MPI for these, but
maybe that isn't such a big deal. Spark has one theoretical disadvantage
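The four collectives mentioned above have rough Spark counterparts. A sketch of the mapping, with no claim that the semantics or performance match MPI's:

```scala
import org.apache.spark.SparkContext

val sc = new SparkContext("local[*]", "collectives")

// scatter: partition driver-side data out across the workers
val rdd = sc.parallelize(Seq(1, 2, 3, 4), numSlices = 4)

// broadcast: ship one read-only value to every worker
val factor = sc.broadcast(10)

// reduce: combine values with an associative, commutative op
val total = rdd.map(_ * factor.value).reduce(_ + _)

// gather: pull the distributed values back to the driver
val gathered = rdd.collect()
```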
I've done some comparisons with my own implementation of TRON on Spark.
From a distributed computing perspective, it does 2x more local work per
iteration than LBFGS, so the parallel isoefficiency is improved slightly.
I think the truncated Newton solver holds some potential because there
have
As to your last line: I've used RDD zipping to avoid GC since MyBaseData is
large and doesn't change. I think this is a very good solution to what is
being asked for.
On Mon, Apr 28, 2014 at 10:44 AM, Ian O'Connell i...@ianoconnell.com wrote:
A mutable map in an object should do what your
I'm not sure what I said came through. RDD zip is not hacky at all, as it
only depends on a user not changing the partitioning. Basically, you would
keep your losses as an RDD[Double] and zip those with the RDD of examples,
and update the losses. You're doing a copy (and GC) on the RDD of
Sent from my iPhone
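The zipping idea described above can be sketched as follows. This is an illustrative outline, not code from the thread: `Example` and `computeLoss` are hypothetical stand-ins for the user's data type and loss function.

```scala
import org.apache.spark.rdd.RDD

// Hypothetical example type and loss function, for illustration only.
case class Example(features: Array[Double], label: Double)
def computeLoss(e: Example, w: Array[Double]): Double =
  e.features.zip(w).map { case (x, wi) => x * wi }.sum - e.label

// Keep the large, static examples in one RDD and the small, changing
// losses in a separate RDD[Double]. zip requires both sides to keep
// the same partitioning and per-partition element order; only the
// small losses RDD is re-created (and GC'd) each iteration.
def updateLosses(examples: RDD[Example],
                 losses: RDD[Double],
                 w: Array[Double]): RDD[Double] =
  examples.zip(losses).map { case (ex, _) => computeLoss(ex, w) }
```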
On Apr 28, 2014, at 9:45 AM, Tom Vacek minnesota...@gmail.com wrote:
I'm not sure what I said came through. RDD zip is not hacky at all, as it
only depends on a user not changing the partitioning. Basically, you would
keep your losses as an RDD[Double] and zip those with the RDD
Ian, I tried playing with your suggestion, but I get a task not
serializable error (and some obvious things didn't fix it). Can you get
that working?
On Mon, Apr 28, 2014 at 10:58 AM, Tom Vacek minnesota...@gmail.com wrote:
As to your last line: I've used RDD zipping to avoid GC since
on loss RDD (copy) ?
Chester
Sent from my iPhone
On Apr 28, 2014, at 9:45 AM, Tom Vacek minnesota...@gmail.com wrote:
I'm not sure what I said came through. RDD zip is not hacky at all, as
it only depends on a user not changing the partitioning. Basically, you
would keep your losses as an RDD
Here are some out-of-the-box ideas: If the elements lie in a fairly small
range and/or you're willing to work with limited precision, you could use
counting sort. Moreover, you could iteratively find the median using
bisection, which would be associative and commutative. It's easy to think
of
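The bisection idea above can be sketched in plain Python. The only aggregate needed per step is a count of elements at or below a pivot, which is a sum, hence associative and commutative and easy to compute with a distributed reduce. A local sketch, not Spark code:

```python
def median_bisect(xs, tol=1e-9):
    """Approximate the median by bisecting on the value range.

    Each step needs only the count of elements <= mid, a sum that is
    associative and commutative, so it distributes cleanly.
    """
    lo, hi = min(xs), max(xs)
    n = len(xs)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        count = sum(1 for x in xs if x <= mid)  # distributable aggregate
        if 2 * count >= n:
            hi = mid   # median is at or below mid
        else:
            lo = mid   # median is above mid
    return (lo + hi) / 2.0
```

For even-length input this converges to the lower median rather than the midpoint of the two central elements, which is usually acceptable at limited precision.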
Thomson Reuters is looking for a graduate (or possibly advanced
undergraduate) summer intern in Eagan, MN. This is a chance to work on an
innovative project exploring how big data sets can be used by professionals
such as lawyers, scientists and journalists. If you're subscribed to this
mailing