[GraphX] - OOM Java Heap Space

2018-10-28 Thread Thodoris Zois
Hello, I have the edges of a graph stored as parquet files (about 3GB). I am loading the graph and trying to compute the total number of triplets and triangles. Here is my code: val edges_parq = sqlContext.read.option("header","true").parquet(args(0) + "/year=" + year) val edges:
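
A minimal, self-contained sketch of that kind of job, assuming the parquet files expose numeric "src" and "dst" columns (the column names, the per-year path layout, and the use of SparkSession instead of the older sqlContext are assumptions here, not details from the original post):

  import org.apache.spark.graphx.{Edge, Graph}
  import org.apache.spark.sql.SparkSession

  object TriangleSketch {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder.appName("triangle-sketch").getOrCreate()
      val year = args(1)

      // Read the edge list from parquet; "src" and "dst" column names are assumed.
      val edgesParq = spark.read.parquet(args(0) + "/year=" + year)

      // Build GraphX edges from the DataFrame rows.
      val edgeRDD = edgesParq.rdd.map(r => Edge(r.getAs[Long]("src"), r.getAs[Long]("dst"), 1))
      val graph = Graph.fromEdges(edgeRDD, defaultValue = 1)

      // Total number of triplets, and total triangles (each triangle is counted
      // once per participating vertex, hence the division by 3).
      val numTriplets = graph.triplets.count()
      val numTriangles = graph.triangleCount().vertices.map { case (_, c) => c.toLong }.sum() / 3

      println(s"triplets=$numTriplets triangles=$numTriangles")
      spark.stop()
    }
  }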

[Spark-GraphX] Conductance, Bridge Ratio & Diameter

2018-10-18 Thread Thodoris Zois
Hello, I am trying to compute conductance, bridge ratio and diameter on a given graph, but I face some problems. - For the conductance my problem is how to compute the cuts so that they are roughly semi-clustered. Is partitionBy from GraphX related to dividing a graph into multiple
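
Note that GraphX's partitionBy only controls how edges are distributed across partitions for computation; it is not a community or cut detection step. As a rough sketch, once you have chosen a vertex set S by some other means, its conductance (crossing edges over the smaller side's volume) can be computed like this in the spark-shell (the graph g and the set s are assumed to exist already):

  import org.apache.spark.graphx.{Graph, VertexId}

  // Conductance of a vertex set s in graph g: edges crossing the cut divided by
  // the smaller of the two volumes (sum of degrees on each side).
  def conductance(g: Graph[Int, Int], s: Set[VertexId]): Double = {
    val crossing = g.triplets.filter(t => s.contains(t.srcId) != s.contains(t.dstId)).count()
    val degrees  = g.degrees  // RDD[(VertexId, Int)]
    val volS     = degrees.filter { case (id, _) => s.contains(id) }.map(_._2.toLong).fold(0L)(_ + _)
    val volRest  = degrees.filter { case (id, _) => !s.contains(id) }.map(_._2.toLong).fold(0L)(_ + _)
    crossing.toDouble / math.max(1L, math.min(volS, volRest))
  }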

Re: Spark on Mesos - Weird behavior

2018-07-23 Thread Thodoris Zois
Anyway, I don't know whether these parameters work without dynamic allocation. On Wed, Jul 11, 2018 at 5:11 PM Thodoris Zois wrote: Hello, Yeah you are right, but I think that works only if you use Spark dynamic

Re: Spark on Mesos - Weird behavior

2018-07-11 Thread Thodoris Zois
parameters instead of spark.cores.max. I think the spark.dynamicAllocation.minExecutors and spark.dynamicAllocation.maxExecutors configuration values can help you. On Tue, Jul 10, 2018 at 5:07 PM Thodoris Zois <z...@ics.forth.gr> wrote: Actually after some exper
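
A sketch of the dynamic-allocation settings mentioned above, as they might be set programmatically (the numbers are placeholders, not recommendations, and on Mesos the external shuffle service must also be running on the agents):

  import org.apache.spark.sql.SparkSession

  // Bound the number of executors via dynamic allocation instead of spark.cores.max.
  val spark = SparkSession.builder
    .appName("dynamic-allocation-sketch")
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.shuffle.service.enabled", "true")      // dynamic allocation needs the external shuffle service
    .config("spark.dynamicAllocation.minExecutors", "2")
    .config("spark.dynamicAllocation.maxExecutors", "3")
    .getOrCreate()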

Re: Spark on Mesos - Weird behavior

2018-07-10 Thread Thodoris Zois
for example, but have with 8 or 9, so you can use smaller executors that better fit the available resources on the nodes, for example nodes with 4 cores and 1 GB RAM. Cheers, Pavel. On Mon, Jul 9, 2018 at 9:05 PM Thodoris Zois wrote: Hello list,

Spark on Mesos - Weird behavior

2018-07-09 Thread Thodoris Zois
Hello list, We are running Apache Spark on a Mesos cluster and we see some weird executor behavior. When we submit an app with e.g. 10 cores and 2 GB of memory per executor and max cores 30, we expect to see 3 executors running on the cluster. However, sometimes there are only 2... Spark applications are
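
For reference, the setup described above would normally be expressed with something like the following (values copied from the question); in coarse-grained Mesos mode Spark aims for roughly spark.cores.max / spark.executor.cores executors, but whether all three actually start still depends on the resource offers Mesos makes:

  import org.apache.spark.sql.SparkSession

  // Static sizing that should yield 3 executors (30 / 10), resource offers permitting.
  val spark = SparkSession.builder
    .appName("mesos-sizing-sketch")
    .config("spark.executor.cores", "10")
    .config("spark.executor.memory", "2g")
    .config("spark.cores.max", "30")
    .getOrCreate()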

Re: Spark 2.3 driver pod stuck in Running state — Kubernetes

2018-06-08 Thread Thodoris Zois
As far as I know from running Spark on Mesos, that is a running state and not a pending one. What you see is normal, but if I am wrong somebody please correct me. At start-up the Spark driver operates normally (running state), but when it comes to starting the executors it cannot allocate resources for them and
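
If the suspicion is that the cluster simply cannot satisfy the executor request, one way to confirm it is to shrink the per-executor ask and see whether executors then appear; a sketch with deliberately small, assumed values:

  import org.apache.spark.sql.SparkSession

  // Deliberately small executor request, to check whether the executors were
  // only waiting for resources the cluster could not provide.
  val spark = SparkSession.builder
    .appName("small-executor-sketch")
    .config("spark.executor.instances", "1")
    .config("spark.executor.cores", "1")
    .config("spark.executor.memory", "512m")
    .getOrCreate()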

Re: Read or save specific blocks of a file

2018-05-03 Thread Thodoris Zois
potential data corruption issues. I would appreciate it if you could share some details of your approach. Thanks! madhav On Wed, May 2, 2018 at 3:34 AM, Thodoris Zois <z...@ics.forth.gr> wrote: That’s what I did :) If you need further information I can post my

Re: ML Linear and Logistic Regression - Poor Performance

2018-04-27 Thread Thodoris Zois
correctly for logistic (meaning 0 & 1's) before modeling? What OS and Spark version are you using? Thank you, Irving Duran On Fri, Apr 27, 2018 at 2:34 PM Thodoris Zois <z...@ics.forth.gr> wrote: H
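
If scaling of the features turns out to be the issue, a standard scaling step in spark.ml looks roughly like this (the DataFrame df and the "features"/"scaledFeatures" column names are assumptions about the pipeline):

  import org.apache.spark.ml.feature.StandardScaler

  // Scale the assembled feature vector to unit standard deviation.
  val scaler = new StandardScaler()
    .setInputCol("features")
    .setOutputCol("scaledFeatures")
    .setWithStd(true)

  // df is assumed to already contain a Vector column named "features".
  val scalerModel = scaler.fit(df)
  val scaledDf = scalerModel.transform(df)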

ML Linear and Logistic Regression - Poor Performance

2018-04-27 Thread Thodoris Zois
Hello, I am running an experiment to test logistic and linear regression on Spark using MLlib. My dataset is only 128 MB and something weird happens. Linear regression takes about 127 seconds with either 1 or 500 iterations. On the other hand, logistic regression most of the time does not
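
For comparison, a minimal spark.ml run of both models on the same data might look like this; the libsvm input path and column layout are assumptions, and maxIter mirrors the 500 iterations mentioned above:

  import org.apache.spark.ml.classification.LogisticRegression
  import org.apache.spark.ml.regression.LinearRegression
  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder.appName("regression-sketch").getOrCreate()

  // Assumed: a libsvm-formatted copy of the dataset with label/features columns.
  val data = spark.read.format("libsvm").load("data.libsvm")

  val linearModel   = new LinearRegression().setMaxIter(500).fit(data)
  val logisticModel = new LogisticRegression().setMaxIter(500).fit(data)

  // How many iterations each solver actually ran before stopping.
  println(s"linear iters=${linearModel.summary.totalIterations} " +
    s"logistic iters=${logisticModel.summary.totalIterations}")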

Re: Scala program to spark-submit on k8 cluster

2018-04-06 Thread Thodoris Zois
If you are looking for a Spark scheduler that runs on top of Kubernetes, then this is the way to go: https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala You can also have a
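
As a sketch only: the Kubernetes backend is selected by pointing the master at a k8s:// URL and naming a container image. In practice these settings are usually passed to spark-submit in cluster mode rather than set in code; the API server URL and image name below are placeholders:

  import org.apache.spark.SparkConf

  // Equivalent settings are normally given to spark-submit via --master and --conf.
  val conf = new SparkConf()
    .setAppName("k8s-sketch")
    .setMaster("k8s://https://kubernetes.example.com:6443")
    .set("spark.kubernetes.container.image", "my-registry/spark:2.3.0")
    .set("spark.executor.instances", "2")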

1 Executor per partition

2018-04-04 Thread Thodoris Zois
Hello list! I am trying to familiarize myself with Apache Spark. I would like to ask something about partitioning and executors. Can I have e.g. 500 partitions but launch only one executor that will run operations on only 1 partition of the 500? And then I would like my job to die. Is there any
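
One way to approximate this is to cap the application at a single small executor and then run a job over just one of the 500 partitions; a sketch with assumed values (the executor-capping keys depend on the cluster manager):

  import org.apache.spark.sql.SparkSession

  // Cap the application at a single one-core executor; values are placeholders.
  val spark = SparkSession.builder
    .appName("one-partition-sketch")
    .config("spark.executor.instances", "1")
    .config("spark.executor.cores", "1")
    .getOrCreate()

  val sc = spark.sparkContext

  // 500 partitions, but run a job over just one of them (partition 0 here).
  val rdd = sc.parallelize(1 to 1000000, numSlices = 500)
  val result = sc.runJob(rdd, (it: Iterator[Int]) => it.sum, Seq(0))

  println(result.mkString(","))
  spark.stop()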