date:20170510

How about the fetch the shuffle data in one same machine?

2017-05-10 Thread raintung li

Hi all,

Currently Spark reads a shuffle file locally only when the executorId is
the same. For the same IP but a different executorId it fetches over the
network, even though the data could be read locally or over loopback.
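
The distinction described above can be sketched roughly as follows. This is
a simplified illustration in Python, not Spark's actual code: BlockLocation,
split_by_executor and split_by_host are hypothetical names, and the host and
executor values are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for where a shuffle block lives; not a Spark class.
BlockLocation = namedtuple("BlockLocation", ["executor_id", "host"])


def split_by_executor(me, locations):
    """Behavior described above: a block is local only if executorId matches."""
    local = [loc for loc in locations if loc.executor_id == me.executor_id]
    remote = [loc for loc in locations if loc.executor_id != me.executor_id]
    return local, remote


def split_by_host(me, locations):
    """Proposed behavior: any block on the same host is treated as local."""
    local = [loc for loc in locations if loc.host == me.host]
    remote = [loc for loc in locations if loc.host != me.host]
    return local, remote


# Made-up example: two executors share one machine, a third is elsewhere.
me = BlockLocation("exec-1", "10.0.0.5")
locations = [
    BlockLocation("exec-1", "10.0.0.5"),
    BlockLocation("exec-2", "10.0.0.5"),
    BlockLocation("exec-3", "10.0.0.6"),
]

if __name__ == "__main__":
    print(len(split_by_executor(me, locations)[0]), "local block(s) by executorId")
    print(len(split_by_host(me, locations)[0]), "local block(s) by host")
```

With these made-up locations, the block held by exec-2 is the one that moves
from the network path to the local path under the proposal.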

Clearly, fetching the local file is faster, since it can use the LVS cache.

What do you think?

Regards
-Raintung

Re: How about the fetch the shuffle data in one same machine?

2017-05-10 Thread Saisai Shao

There is a JIRA about this
(https://issues.apache.org/jira/browse/SPARK-6521). In current Spark,
shuffle fetch still goes through Netty even when two executors are on the
same node, but according to the test on the JIRA, performance is close
whether or not the network is bypassed. From my understanding, the kernel
will not transfer data to the NIC if it is just loopback communication
(please correct me if I'm wrong).

spark.sql.codegen.comments not in SQLConf?

2017-05-10 Thread Jacek Laskowski

Hi,

It seems that the spark.sql.codegen.comments property [1] is not defined
in SQLConf [2], which appears to be the place for all Spark SQL-related
properties (certainly for codegen).

Don't think it merits a JIRA issue so just asking here.

If agreed, I'd like to propose a PR. Thanks.

[1] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L822
[2] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

Regards,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: How about the fetch the shuffle data in one same machine?

2017-05-10 Thread raintung li

I don't think it goes over loopback; only traffic to localhost or
127.0.0.1 will. Also, the benchmark code should be simple and should not
involve any computation. Just write two tests:
one that only reads the file locally
the other that only reads the file through Netty

Read files of different sizes (small -> big); the benchmarks should differ.
Of course a memory copy is faster than network delivery.
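
A minimal sketch of such a two-way comparison, assuming plain TCP sockets on
127.0.0.1 as a stand-in for Netty (so it only approximates the Spark fetch
path). The function names, the chunk size and the file sizes are all
illustrative choices, not taken from the JIRA test.

```python
import os
import socket
import tempfile
import threading
import time

CHUNK = 64 * 1024  # read/send buffer size; an arbitrary choice


def read_local(path):
    """Read the whole file from disk and return the number of bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            total += len(chunk)
    return total


def read_over_loopback(path):
    """Serve the file on a 127.0.0.1 TCP socket and read it back."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn, open(path, "rb") as f:
            while True:
                chunk = f.read(CHUNK)
                if not chunk:
                    break
                conn.sendall(chunk)

    sender = threading.Thread(target=serve)
    sender.start()
    total = 0
    with socket.create_connection(("127.0.0.1", port)) as client:
        while True:
            chunk = client.recv(CHUNK)
            if not chunk:  # sender closed the connection
                break
            total += len(chunk)
    sender.join()
    server.close()
    return total


def benchmark(size_bytes):
    """Time both read paths on a temporary file of the given size."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(size_bytes))
        path = f.name
    try:
        start = time.perf_counter()
        n_local = read_local(path)
        local_s = time.perf_counter() - start
        start = time.perf_counter()
        n_net = read_over_loopback(path)
        net_s = time.perf_counter() - start
    finally:
        os.remove(path)
    return n_local, n_net, local_s, net_s


if __name__ == "__main__":
    for size in (64 * 1024, 1024 * 1024, 16 * 1024 * 1024):
        n_local, n_net, local_s, net_s = benchmark(size)
        print(f"{size:>10} bytes  local {local_s:.6f}s  loopback {net_s:.6f}s")
```

Note that the file has just been written, so both paths most likely read it
from the OS cache rather than from disk, and the loopback timing includes
connection setup. A fairer test would repeat each measurement several times
and would use files that were not just written.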

Re: spark.sql.codegen.comments not in SQLConf?

2017-05-10 Thread Reynold Xin

It's probably because it is annoying to propagate that using SQL conf.