Re: Codegen In Shuffle

2015-11-04 Thread
I see. Thanks very much. 2015-11-04 16:25 GMT+08:00 Reynold Xin <r...@databricks.com>: > GenerateUnsafeProjection -- projects any internal row data structure > directly into bytes (UnsafeRow). > > > On Wed, Nov 4, 2015 at 12:21 AM, 牛兆捷 <nzjem...@gmail.com> wrote:

Codegen In Shuffle

2015-11-04 Thread
Dear all: The Tungsten project has mentioned that code generation is being applied to speed up the conversion of data from the in-memory binary format to the wire protocol for shuffle. Where can I find the related implementation in the Spark codebase? -- *Regards,* *Zhaojie*

RDD checkpoint

2015-07-13 Thread
A checkpointed RDD is computed twice. Why not checkpoint the RDD as soon as it is computed? Is there any special reason for this? -- *Regards,* *Zhaojie*
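The double computation described in this question can be illustrated with a toy model. The sketch below is not Spark code: `TinyRDD` and all of its members are hypothetical names, and it assumes only what the question states, namely that the checkpoint data is written in a separate pass after the first action. Under that assumption, caching the dataset before checkpointing lets the second pass reuse the first result.

```python
class TinyRDD:
    """Toy stand-in for a lazily computed dataset (hypothetical, not Spark's RDD)."""

    def __init__(self, compute):
        self._compute = compute          # function that produces the data
        self._use_cache = False
        self._cached = None
        self._checkpoint_requested = False
        self.checkpoint_data = None
        self.compute_calls = 0           # how many times the data was recomputed

    def cache(self):
        # Mark the dataset to be kept in memory after its first computation.
        self._use_cache = True
        return self

    def checkpoint(self):
        # Only marks the dataset; nothing is written until an action runs.
        self._checkpoint_requested = True

    def _materialize(self):
        if self._use_cache and self._cached is not None:
            return self._cached
        self.compute_calls += 1
        data = list(self._compute())
        if self._use_cache:
            self._cached = data
        return data

    def count(self):
        n = len(self._materialize())
        # Assumed behavior: the checkpoint is written in a second pass
        # after the action finishes, which materializes the data again.
        if self._checkpoint_requested and self.checkpoint_data is None:
            self.checkpoint_data = self._materialize()
        return n


plain = TinyRDD(lambda: range(5))
plain.checkpoint()
plain.count()

cached = TinyRDD(lambda: range(5)).cache()
cached.checkpoint()
cached.count()

print(plain.compute_calls, cached.compute_calls)
```

In this model the uncached dataset is computed once for the action and once more for the checkpoint pass, while the cached one is computed a single time.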

Re: Questions about Fault tolerance of Spark

2015-07-11 Thread
message From: 牛兆捷 Date:07-09-2015 04:19 (GMT-05:00) To: dev@spark.apache.org, u...@spark.apache.org Subject: Questions about Fault tolerance of Spark Hi All: We already know that Spark utilizes the lineage to recompute the RDDs when failure occurs. I want to study the performance

Questions about Fault tolerance of Spark

2015-07-09 Thread
Hi All: We already know that Spark utilizes the lineage to recompute the RDDs when failure occurs. I want to study the performance of this fault-tolerant approach and have some questions about it. 1) Is there any benchmark (or standard failure model) to test the fault tolerance of these kinds of

Workload for spark testing

2014-09-13 Thread
Hi All: We know that some of Spark's memory is used for computing (e.g., spark.shuffle.memoryFraction) and some is used for caching RDDs for future use (e.g., spark.storage.memoryFraction). Is there any existing workload that utilizes both of them during its life cycle? I want to do some

Re: memory size for caching RDD

2014-09-04 Thread
at 8:13 PM, 牛兆捷 nzjem...@gmail.com wrote: Dear all: Spark uses memory to cache RDDs, and the memory size is specified by spark.storage.memoryFraction. Once the executor starts, does Spark support adjusting/resizing the size of this memory region dynamically? Thanks. -- *Regards

Re: memory size for caching RDD

2014-09-04 Thread
Thanks, Raymond. I duplicated the question. Please see the reply here. 2014-09-04 14:27 GMT+08:00 牛兆捷 nzjem...@gmail.com: But is it possible to make it resizable? When we don't have many RDDs to cache, we can give some memory to others. 2014-09-04 13:45 GMT+08:00 Patrick Wendell pwend

Re: memory size for caching RDD

2014-09-04 Thread
is that this is done per RDD, not per block. And then, if the storage level includes disk, the data on the disk will be removed too. Best Regards, Raymond Liu From: 牛兆捷 [mailto:nzjem...@gmail.com] Sent: Thursday, September 04, 2014 2:57 PM To: Liu, Raymond Cc: Patrick Wendell; u

memory size for caching RDD

2014-09-03 Thread
Dear all: Spark uses memory to cache RDDs, and the memory size is specified by spark.storage.memoryFraction. Once the executor starts, does Spark support adjusting/resizing the size of this memory region dynamically? Thanks. -- *Regards,* *Zhaojie*
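For context on how the fixed regions in this question are sized, here is a rough arithmetic sketch. It assumes the static memory model of Spark 1.x releases of that period, where the storage region is approximately heap × spark.storage.memoryFraction × spark.storage.safetyFraction and the shuffle region is heap × spark.shuffle.memoryFraction × spark.shuffle.safetyFraction, with defaults of 0.6, 0.9, 0.2, and 0.8. The function name and the flat heap figure are illustrative only; a real executor reports slightly less usable heap than the configured maximum.

```python
def static_memory_regions(heap_bytes,
                          storage_fraction=0.6, storage_safety=0.9,
                          shuffle_fraction=0.2, shuffle_safety=0.8):
    """Approximate region sizes under the assumed Spark 1.x static memory model.

    The fractions are fixed when the executor starts, which is why the
    regions cannot grow or shrink afterwards in this model.
    """
    storage = round(heap_bytes * storage_fraction * storage_safety)
    shuffle = round(heap_bytes * shuffle_fraction * shuffle_safety)
    return {
        "storage": storage,                      # cached RDD blocks
        "shuffle": shuffle,                      # shuffle and aggregation buffers
        "other": heap_bytes - storage - shuffle  # user code and overhead
    }


# Illustrative heap of 1000 units, to keep the arithmetic easy to follow.
regions = static_memory_regions(1000)
print(regions)
```

Under these assumed defaults, a little over half of the heap is reserved for cached RDDs whether or not anything is cached, which is the inefficiency the question is pointing at.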

acquire and give back resources dynamically

2014-08-13 Thread
Dear all: Can Spark acquire resources from YARN and give them back dynamically? -- *Regards,* *Zhaojie*