We have had excellent results operating on RDDs using Java 8 with Lambdas.
It’s slightly more verbose than Scala, but I haven’t found this an issue,
and haven’t missed any functionality.
The new DataFrame API makes the Spark platform even more language agnostic.
Tristan
On 15 July 2015 at 06:40,
You could use a map() operation, but the easiest way is probably to just
call values() method on the JavaPairRDD to get a JavaRDD.
See this link:
https://www.safaribooksonline.com/library/view/learning-spark/9781449359034/ch04.html
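In Spark itself this is just `JavaRDD<V> vals = pairRDD.values();`, or explicitly `pairRDD.map(t -> t._2())`. Neither needs anything Spark-specific to illustrate: values() simply projects each tuple to its second element, which the map() alternative spells out. A minimal plain-Java sketch of that projection (class and method names here are illustrative, not Spark API):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ValuesDemo {
    // Equivalent of pairRDD.map(t -> t._2()): project each key/value pair to its value.
    static List<Integer> values(List<Map.Entry<String, Integer>> pairs) {
        return pairs.stream()
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stand-in for a JavaPairRDD: an ordinary list of key/value pairs.
        List<Map.Entry<String, Integer>> pairs = Arrays.asList(
                new SimpleEntry<>("a", 1),
                new SimpleEntry<>("b", 2));
        System.out.println(values(pairs));  // [1, 2]
    }
}
```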
Tristan
On 13 May 2015 at 23:12, Yasemin Kaya wrote:
> Hi,
Shahab wrote:
>
>> Thanks Tristan for sharing this. Actually this happens when I am reading
>> a csv file of 3.5 GB.
>>
>> best,
>> /Shahab
>>
>>
>>
>> On Tue, May 5, 2015 at 9:15 AM, Tristan Blakers
>> wrote:
>>
Hi Shahab,
I’ve seen exceptions very similar to this (it also manifests as a negative
array size exception), and I believe it’s really a bug in Kryo.
See this thread:
http://mail-archives.us.apache.org/mod_mbox/spark-user/201502.mbox/%3ccag02ijuw3oqbi2t8acb5nlrvxso2xmas1qrqd_4fq1tgvvj...@mail.gmail
> (My impression from browsing the code is that even when writing to a
> stream, Kryo has an internal buffer of limited size, which it periodically
> flushes. Perhaps we can get Kryo to turn off that buffer, or we can at
> least get it to flush more often.)
>
> thanks,
> Imran
>
>
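For the buffer side of this, the knobs live in Spark's Kryo configuration. A sketch for spark-defaults.conf, assuming Spark 1.2-era property names (later releases renamed them to spark.kryoserializer.buffer / spark.kryoserializer.buffer.max, taking size suffixes like "512m"); the values are illustrative, not recommendations:

```
# Kryo serialization buffers (sizes are in MB in Spark 1.2.x)
spark.kryoserializer.buffer.mb        64
spark.kryoserializer.buffer.max.mb    512
```

Note that the buffer is backed by a Java byte array, so it cannot grow past 2GB regardless of the setting, which would be consistent with failures appearing once an object graph exceeds a couple of GB.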
I get the same exception simply by doing a large broadcast of about 6GB.
Note that I’m broadcasting a small number (~3m) of fat objects. There’s
plenty of free RAM. This and related Kryo exceptions seem to crop up
whenever an object graph of more than a couple of GB gets passed around.
A search shows several historical threads for similar Kryo issues, but none
seem to have a definitive solution. Currently using Spark 1.2.0.
While collecting/broadcasting/grouping moderately sized data sets (~500MB -
1GB), I regularly see exceptions such as the one below.
I’ve tried increasing th
>
> On Thu, Dec 18, 2014 at 10:42 AM, Tristan Blakers
> wrote:
> > Suspected the same thing, but because the underlying data classes are
> > deserialised by Avro, I think they have to be mutable, as you need to
> > provide the no-args constructor with settable fields.
at 21:25, Sean Owen wrote:
>
> It sounds a lot like your values are mutable classes and you are
> mutating or reusing them somewhere? It might work until you actually
> try to materialize them all and find many point to the same object.
>
> On Thu, Dec 18, 2014 at 10:06 AM
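The failure mode Sean describes can be reproduced without Spark at all: reuse one mutable object while accumulating references to it, and every "collected" element ends up pointing at the same instance. A plain-Java sketch (class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class MutableReuseDemo {
    // Hypothetical mutable record, standing in for an Avro-deserialised class.
    static class Record {
        int value;
    }

    // Simulates a reader that reuses one mutable buffer object per "row".
    static List<Record> collectReused() {
        List<Record> collected = new ArrayList<>();
        Record reused = new Record();
        for (int i = 0; i < 3; i++) {
            reused.value = i;        // mutate the shared object in place
            collected.add(reused);   // stores a reference, not a copy
        }
        return collected;
    }

    public static void main(String[] args) {
        // Every element is the same object, holding the last value written.
        for (Record r : collectReused()) {
            System.out.println(r.value);  // prints 2 three times
        }
    }
}
```

The bug stays invisible until the whole collection is materialized, which matches "it might work until you actually try to materialize them all".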
Hi,
I’m getting some seemingly invalid results when I collect an RDD. This is
happening in both Spark 1.1.0 and 1.2.0, using Java 8 on Mac.
See the following code snippet:
JavaRDD rdd = pairRDD.values();
rdd.foreach( e -> System.out.println( "RDD Foreach: " + e ) );
rdd.collect().forEach( e -> System.out.println( "Collect Foreach: " + e ) );
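If the cause is mutable value objects being reused by the reader, as suggested elsewhere in the thread, the usual workaround is to copy each record before it escapes the iterator, e.g. something like `rdd.map(v -> new Value(v))` before collect(). A plain-Java sketch of the defensive copy (class names are illustrative, not from the original code):

```java
import java.util.ArrayList;
import java.util.List;

public class CopyBeforeCollect {
    // Hypothetical mutable value class: Avro-style no-args constructor
    // plus a copy constructor for defensive copies.
    static class Value {
        int n;
        Value() {}
        Value(Value other) { this.n = other.n; }
    }

    // Same reader loop as the reuse pitfall, but each element is copied.
    static List<Value> collectCopies() {
        List<Value> collected = new ArrayList<>();
        Value reused = new Value();
        for (int i = 0; i < 3; i++) {
            reused.n = i;
            collected.add(new Value(reused));  // copy; don't keep the shared reference
        }
        return collected;
    }

    public static void main(String[] args) {
        for (Value v : collectCopies()) {
            System.out.println(v.n);  // prints 0, 1, 2
        }
    }
}
```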