RE: Question about GraphX connected-components

2015-10-12 Thread John Lilley
Geoff Thompson <geoff.thomp...@redpoint.net> Subject: Re: Question about GraphX connected-components Let's start from some basics: you might need to split your data into more partitions. Spilling depends on the configuration you pass when you create the graph (look for the storage level parameter) and

Re: Question about GraphX connected-components

2015-10-10 Thread Igor Berman
Let's start from some basics: you might need to split your data into more partitions. Spilling depends on the configuration you pass when you create the graph (look for the storage level parameter) and on your global configuration. In addition, your assumption of 64GB/100M is probably wrong, since Spark divides memory