date:20180219

Re: [Spark Streaming]: Non-deterministic uneven task-to-machine assignment

2018-02-19 Thread vijay.bvp

Apologies for the long answer. Understanding partitioning at each stage of the RDD graph/lineage is important for efficient parallelism and balanced load. This applies to any source, streaming or static. You have a tricky situation here of one source, Kafka, with 9

sqoop import job not working when spark thrift server is running.

2018-02-19 Thread akshay naidu

Hello, I was trying to optimize my Spark cluster. I did so to some extent by making some changes in the yarn-site.xml and spark-defaults.conf files. Before the changes, the MapReduce import job was running fine alongside a slow Thrift server. After the changes, I have to kill the Thrift server to execute my

Re: Does Pyspark Support Graphx?

2018-02-19 Thread xiaobo

When using the --jars option, we have to include it every time we submit a job. It seems that adding the jars to the classpath of every slave node is the only way to "install" Spark packages. -- Original -- From: Nicholas Hakobian

Re: [graphframes]how Graphframes Deal With BidirectionalRelationships

2018-02-19 Thread xiaobo

So the question becomes: does GraphFrames support bidirectional relationships natively with only one edge? -- Original -- From: Felix Cheung Date: Tue, Feb 20, 2018 10:01 AM To: xiaobo, user@spark.apache.org

Errors when running unit tests

2018-02-19 Thread karuppayya

Hi, I get errors like the ones below when trying to run the Spark unit tests:
> zipPartitions(test.org.apache.spark.Java8RDDAPISuite) Time elapsed: 2.212 sec <<< ERROR!
> java.lang.IllegalStateException: failed to create a child event loop
> at

Re: [graphframes]how Graphframes Deal With Bidirectional Relationships

2018-02-19 Thread Felix Cheung

Generally that would be the approach. But since you have effectively doubled the number of edges, this will likely affect the scale at which your job can run. From: xiaobo Sent: Monday, February 19, 2018 3:22:02 AM To: user@spark.apache.org Subject:
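As a rough illustration of the trade-off mentioned in this reply, the sketch below uses plain Python (not the GraphFrames API) to represent each bidirectional relationship as two directed edges. The function name `to_directed` and the sample data are hypothetical; the only point is that the edge count roughly doubles.

```python
def to_directed(undirected_edges):
    """Expand each undirected (a, b) pair into two directed edges."""
    directed = []
    for src, dst in undirected_edges:
        directed.append((src, dst))
        if src != dst:  # a self-loop needs only one directed edge
            directed.append((dst, src))
    return directed


# Hypothetical sample data: two bidirectional relationships.
relationships = [("alice", "bob"), ("bob", "carol")]
edges = to_directed(relationships)
print(len(relationships), "relationships ->", len(edges), "directed edges")
```

A graph stored this way answers "who is connected to X" with a single lookup on the source column, at the cost of storing and shuffling about twice as many edges.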

Re: KafkaUtils.createStream(..) is removed for API

2018-02-19 Thread Cody Koeninger

I can't speak for committers, but my guess is it's more likely for DStreams in general to stop being supported before that particular integration is removed. On Sun, Feb 18, 2018 at 9:34 PM, naresh Goud wrote: > Thanks Ted. > > I see createDirectStream is

Re: Does Pyspark Support Graphx?

2018-02-19 Thread Nicholas Hakobian

If you copy the Jar file and all of the dependencies to the machines, you can manually add them to the classpath. If you are using Yarn and HDFS you can alternatively use --jars and point it to the hdfs locations of the jar files and it will (in most cases) distribute them to the worker nodes at

Re: [Spark Streaming]: Non-deterministic uneven task-to-machine assignment

2018-02-19 Thread Aleksandar Vitorovic

Hi Vijay, Thank you very much for your reply. Setting the number of partitions explicitly in the join, and the influence of memory pressure on partitioning, were definitely very good insights. In the end, we avoided the issue of uneven load balancing completely by doing the following two things: a) Reducing the
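Why an explicit partition count matters can be sketched in plain Python. This is a simplified stand-in for hash partitioning (integer key modulo the partition count), not Spark's actual code; `partition_sizes` and the sample keys are hypothetical.

```python
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count how many keys land in each partition under key % num_partitions."""
    counts = Counter(key % num_partitions for key in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Hypothetical keys that share a common factor with one of the partition counts.
keys = list(range(0, 120, 4))  # 30 keys, all multiples of 4

print(partition_sizes(keys, 8))  # keys pile up in only two partitions
print(partition_sizes(keys, 7))  # keys spread across all seven partitions
```

In this toy setup the same keys are badly skewed with 8 partitions and nearly even with 7, which is the kind of effect that choosing the partition count deliberately can avoid.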

Understand task timing

2018-02-19 Thread Thomas Decaux

Using Spark 1.6.2, I want to understand what « Duration » really means (and why it is slow). Running a simple SELECT COUNT against a Parquet file stored in HDFS: NODE_LOCAL 1 / DATA02 2018/02/19 09:54:27 5 s 30 ms 8.8 MB (hadoop) / 3010830 8 ms 77.2 KB / 1666 This means "took 5 seconds to

Unsubscribe

2018-02-19 Thread Ryan Myer

- To unsubscribe e-mail: user-unsubscr...@spark.apache.org

11 matches
