It looks like it's because of a different snappy version; if you disable compression or switch to lz4, the size is no different.
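For reference, the codec in question is controlled by the `spark.io.compression.codec` setting (snappy is typically the default); a minimal `spark-defaults.conf` fragment to switch to lz4 would look like:

```
spark.io.compression.codec    lz4
```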
Yours, Xuefeng Wu (吴雪峰)
> On Feb 10, 2015, at 6:13 PM, chris wrote:
>
> Hello,
>
> as the original message from Kevin Jung never got accepted to the
> mailing list
Could you find the shuffle files? Or were the files deleted by other processes?
Yours, Xuefeng Wu (吴雪峰)
> On Feb 5, 2015, at 11:14 PM, Yifan LI wrote:
>
> Hi,
>
> I am running a heavy memory/cpu overhead graphx application, I think the
> memory is sufficient and set RDDs’
What does the jstack dump show?
Yours, Xuefeng Wu (吴雪峰)
> On Feb 6, 2015, at 10:20 AM, Michael Albert
> wrote:
>
> My apologies for following up my own post, but I thought this might be of
> interest.
>
> I terminated the java process corresponding to executor whic
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
--
~Yours, Xuefeng Wu (吴雪峰)
How about saving it as an object?
Yours, Xuefeng Wu (吴雪峰)
> On Dec 30, 2014, at 9:27 PM, Jason Hong wrote:
>
> Dear all:)
>
> We're trying to make a graph using large input data and get a subgraph
> applied some filter.
>
> Now, we wanna save this graph to HDFS so that
Looks good.
I'm concerned about foldLeftByKey, which looks like it breaks consistency with foldLeft on RDD and aggregateByKey on PairRDD.
Yours, Xuefeng Wu (吴雪峰)
> On Dec 4, 2014, at 7:47 AM, Koert Kuipers wrote:
>
> fold
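The consistency in question can be sketched in plain Scala, with no Spark required: foldLeft takes a single sequential op, while the aggregate/aggregateByKey shape splits the work into a per-partition seqOp plus a combOp that merges partial results. Here `grouped(2)` stands in for partitions:

```scala
// foldLeft: one sequential operation over the whole sequence
val xs = Vector(1, 2, 3, 4, 5, 6)
val total = xs.foldLeft(0)(_ + _)

// aggregate-style: fold each simulated "partition" with a seqOp,
// then merge the partial results with a combOp
val partials = xs.grouped(2).map(_.foldLeft(0)(_ + _)).toList
val merged = partials.foldLeft(0)(_ + _)
// total and merged agree because + is associative
```

The split into seqOp/combOp is what lets partitions be processed independently and combined in any order.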
I have a similar requirement: take the top N by key. Right now I use groupByKey, but in some datasets one key groups more than half of the data.
Yours, Xuefeng Wu (吴雪峰)
> On Dec 4, 2014, at 7:26 AM, Nathan Kronenfeld
> wrote:
>
> I think it would depend on the type and amount of inform
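One common way around the hot-key problem described above is aggregateByKey with a bounded accumulator instead of groupByKey. A plain-Scala sketch of just the per-key logic (the names `seqOp`, `combOp`, and `topNByKey` are mine, not Spark API):

```scala
// Keep only the top n values per key; the accumulator never grows past n.
// In Spark this pair of functions would be passed to
// aggregateByKey(List.empty[Int])(seqOp(n), combOp(n)).
def seqOp(n: Int)(acc: List[Int], v: Int): List[Int] =
  (v :: acc).sorted(Ordering[Int].reverse).take(n)

def combOp(n: Int)(a: List[Int], b: List[Int]): List[Int] =
  (a ++ b).sorted(Ordering[Int].reverse).take(n)

// Simulate the fold over (key, value) pairs on a plain collection:
def topNByKey(pairs: Seq[(String, Int)], n: Int): Map[String, List[Int]] =
  pairs.foldLeft(Map.empty[String, List[Int]]) { case (m, (k, v)) =>
    m.updated(k, seqOp(n)(m.getOrElse(k, Nil), v))
  }
```

Because the accumulator is capped at n per key, a hot key no longer pulls half the dataset into one group the way groupByKey does.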
Hi Debasish,
I found this test code in map translate; would it collect all products too?
+ val sortedProducts = products.toArray.sorted(ord.reverse)
Yours, Xuefeng Wu (吴雪峰)
> On Dec 2, 2014, at 1:33 AM, Debasish Das wrote:
>
> rdd.top collects it on master...
>
> If you want top
val topScores = for {
  (_, ageScores) <- takeTop(scores, _.age)
  (_, numScores) <- takeTop(ageScores, _.num)
} yield {
  numScores
}
topScores.size
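`takeTop` isn't defined in the snippet; purely as a guess at its shape (an Option-returning "top group by key" helper, which is what the for-comprehension implies), here is one runnable reconstruction — `Score`, its fields, and the whole `takeTop` body are my assumptions:

```scala
// Hypothetical data shape implied by _.age and _.num above.
case class Score(age: Int, num: Int)

// Hypothetical takeTop: group by the key function and return the
// (key, items) pair with the largest key, if the input is non-empty.
def takeTop[A, K: Ordering](xs: Seq[A], key: A => K): Option[(K, Seq[A])] =
  if (xs.isEmpty) None
  else {
    val grouped = xs.groupBy(key)
    val top = grouped.keys.max
    Some(top -> grouped(top))
  }

val scores = Seq(Score(20, 5), Score(20, 9), Score(30, 1))
val topScores = for {
  (_, ageScores) <- takeTop(scores, (_: Score).age)
  (_, numScores) <- takeTop(ageScores, (_: Score).num)
} yield numScores
```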
--
~Yours, Xuefeng Wu (吴雪峰)
4 at 5:39 PM, Xuefeng Wu wrote:
>
>> scala> import scala.collection.GenSeq
>> scala> val seq = GenSeq("This", "is", "an", "example")
>>
>> scala> seq.aggregate("0")(_ + _, _ + _)
>> res0: String = 0Th
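The truncated `res0` above hides the interesting part of aggregate's contract: with a parallel GenSeq, the zero value "0" can appear once per partition, because every partition starts its seqOp from the zero and the combOp then concatenates the partials. A plain-Scala simulation of two partitions:

```scala
val words = Vector("This", "is", "an", "example")

// Sequential fold: the zero "0" is used exactly once
val sequential = words.foldLeft("0")(_ + _)

// Simulated 2-partition aggregate: each partition starts from "0",
// then the partial strings are concatenated by the combOp
val partials = words.grouped(2).map(_.foldLeft("0")(_ + _)).toList
val parallelStyle = partials.reduce(_ + _)
// sequential: "0Thisisanexample"; parallelStyle: "0Thisis0anexample"
```

This is why aggregate's documentation asks the zero to be a neutral element: string "0" is not, so sequential and parallel evaluation give different results.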
There is a docker script for Spark 0.9 in the Spark git repo.
Yours, Xuefeng Wu (吴雪峰)
> On Aug 10, 2014, at 8:27 PM, 诺铁 wrote:
>
> hi, all,
>
> I am playing with docker, trying to create a spark cluster with docker
> containers.
>
> since spark master, worker, driver all nee
atMap and what
> is a good use case for each?
>
> --
> Eran | CTO
>
--
~Yours, Xuefeng Wu (吴雪峰)
Hi Aureliaono,
First, Docker is not ready for production, unless you know what you are doing and are prepared for some risk.
Second, in my opinion, there is a lot of hard-coding in the Spark docker scripts; you have to modify them for your goal.
Yours, Xuefeng Wu (吴雪峰)
> On Mar 10, 2014, at 12