Ravi, did your issue ever get solved for this?
I think I've been hitting the same thing; it looks like
the spark.sql.autoBroadcastJoinThreshold setting isn't kicking in as
expected. If I set it to -1, the computation proceeds successfully.
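For anyone hitting this later, here is a sketch of the workaround described above (assuming a Spark 1.6-era sqlContext handle, as in spark-shell of that vintage; the exact handle depends on your environment):

```scala
// Any negative value disables the automatic broadcast-hash join,
// so the planner falls back to a shuffle-based join instead.
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", "-1")

// Equivalently, at launch time:
//   spark-shell --conf spark.sql.autoBroadcastJoinThreshold=-1
```

Note that this trades the failure for a slower join; the threshold normally exists so that small tables are shipped to executors rather than shuffled.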
On Tue, Jun 14, 2016 at 12:28 AM, Ravi Aggarwal
object MyCoreNLP {
  @transient lazy val coreNLP = new CoreNLP()
}
and then refer to it from your map/reduce/mapPartitions and it should
be fine (presuming it's thread safe); it will only be initialized once per
classloader per JVM.
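A small, self-contained demonstration of the once-per-classloader guarantee behind this pattern (ExpensiveSingleton is a stand-in for the CoreNLP pipeline, not code from the thread):

```scala
// The body of a lazy val runs at most once per JVM/classloader,
// however many tasks or threads touch it afterwards.
object ExpensiveSingleton {
  var initCount = 0                        // visible for demonstration only
  lazy val resource: String = {
    initCount += 1                         // runs exactly once
    "initialized"
  }
}

val a = ExpensiveSingleton.resource
val b = ExpensiveSingleton.resource
assert(a eq b)                             // same instance every time
assert(ExpensiveSingleton.initCount == 1)  // initializer ran once
```

On a Spark executor the same holds: every task in that JVM shares the one instance, which is exactly what you want for an expensive-to-construct, thread-safe resource.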
On Mon, Nov 24, 2014 at 7:58 AM, Evan Sparks
What's the error with the 2.10 version of Algebird?
On Thu, Oct 30, 2014 at 12:49 AM, thadude ohpre...@yahoo.com wrote:
I've tried:
./bin/spark-shell --jars algebird-core_2.10-0.8.1.jar
scala> import com.twitter.algebird._
import com.twitter.algebird._
scala> import HyperLogLog._
import HyperLogLog._
Algebird 0.8.0 has 2.11 support if you want to run in a 2.11 env.
On Thu, Oct 30, 2014 at 10:08 AM, Buntu Dev buntu...@gmail.com wrote:
Thanks. I was using Scala 2.11.1 and was able to
use algebird-core_2.10-0.1.11.jar with spark-shell.
On Thu, Oct 30, 2014 at 8:22 AM, Ian O'Connell i
I would guess the field serializer is having trouble reconstructing
the class; it's pretty much best-effort.
Is this an intermediate type?
On Thu, Sep 25, 2014 at 2:12 PM, Sandy Ryza sandy.r...@cloudera.com wrote:
We're running into an error (below) when trying to read spilled
Mmm, how many days' worth of data, and how deep is your data nesting?
I suspect you're running into a current issue with Parquet (a fix is in
master but I don't believe it's released yet). It reads all the metadata on
the submitter node as part of scheduling the job. This can cause long start
times (timeouts
Depending on your requirements, when computing distinct cardinality for
hourly metrics, a much more scalable method would be to use a HyperLogLog
data structure.
A Scala impl people have used with Spark would be
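A hedged sketch of that approach using Algebird's HyperLogLogMonoid (method names taken from Algebird's core API; verify them against the version you depend on):

```scala
import com.twitter.algebird.HyperLogLogMonoid

// 12 bits of precision gives roughly a 1% standard error while
// keeping each sketch on the order of 4 KB.
val hll = new HyperLogLogMonoid(bits = 12)

// Build one small sketch per value, then merge them with the monoid.
// Sketches merge associatively, so this parallelizes cleanly per hour.
val ids = Seq("u1", "u2", "u1", "u3")
val merged = hll.sum(ids.map(id => hll.create(id.getBytes("UTF-8"))))

// Approximate distinct count (the exact answer here is 3).
println(merged.estimatedSize)
```

The win over a naive distinct-count is that an hour's worth of data reduces to one tiny mergeable sketch, so rollups across hours or days are cheap.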
I think the distinction there might be that they never said they ran that
code under CDH5, just that Spark supports it and Spark runs under CDH5; not
that you can use these features while running under CDH5.
They could use Mesos or the standalone scheduler to run them.
On Tue, May 6, 2014 at 6:16 AM,
A mutable map in an object should do what you're looking for then, I believe.
You just reference the object as an object in your closure, so it won't be
swept up when your closure is serialized, and you can then reference variables
of the object on the remote host. e.g.:
object MyObject {
  // one per JVM on each executor; key/value types are illustrative
  val mmap = scala.collection.mutable.Map.empty[String, Int]
}
I'm guessing the other result was wrong, or just never evaluated here. The
RDD transforms being lazy may have let it be expressed, but it wouldn't
work: nested RDDs are not supported.
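To make the limitation concrete, here is a sketch of the usual rewrite (assuming a SparkContext `sc` and illustrative RDD names `big` and `small`; this is not code from the original thread):

```scala
// This pattern fails: RDD transformations can only be invoked on the
// driver, so an RDD referenced inside another RDD's closure breaks.
//   big.map(x => small.filter(_ == x).count())   // NOT supported
//
// Common fix: collect/broadcast the smaller side and use a plain
// Scala collection inside the closure instead.
val smallSet = sc.broadcast(small.collect().toSet)
val matched  = big.filter(x => smallSet.value.contains(x))
```

If neither side fits in memory, a join on a shared key is the usual alternative to the broadcast.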
On Mon, Mar 17, 2014 at 4:01 PM, anny9699 anny9...@gmail.com wrote:
Hi Andrew,
Thanks for the reply.