*Subject:* Re: How to share large resources like dictionaries while
processing data with Spark ?
Thanks so much, Yiannis, Olivier, Huang!
On Thu, Jun 4, 2015 at 6:44 PM, Yiannis Gkoufas johngou...@gmail.com
[mailto:dgoldenberg...@gmail.com]
Sent: Friday, June 5, 2015 12:12 AM
To: Yiannis Gkoufas
Cc: Olivier Girardot; user@spark.apache.org
Subject: Re: How to share large resources like dictionaries while processing
data with Spark ?
Thanks so much, Yiannis, Olivier, Huang!
On Thu, Jun 4, 2015 at 6
Subject: RE: How to share large resources like dictionaries while processing
data with Spark ?
It is called Indexed RDD https://github.com/amplab/spark-indexedrdd
From: Dmitry Goldenberg [mailto:dgoldenberg...@gmail.com]
Sent: Friday, June 5, 2015 3:15 PM
To: Evo
: How to share large resources like dictionaries while
processing data with Spark ?
Thanks so much, Yiannis, Olivier, Huang!
On Thu, Jun 4, 2015 at 6:44 PM, Yiannis Gkoufas johngou...@gmail.com
wrote:
Hi there,
I would recommend
You can use it as a broadcast variable, but if it's too large (more than
1 GB, I guess), you may need to share it by joining it with the other RDDs
on some kind of key.
But this is the kind of thing broadcast variables were designed for.
Regards,
Olivier.
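
A minimal PySpark sketch of the two options mentioned above, assuming a small read-only dictionary. The names `tag_words`, `lookup_with_broadcast`, `lookup_with_join`, and the sample data are hypothetical and not from the thread; `sc` is assumed to be an existing SparkContext.

```python
def tag_words(words, dictionary):
    """Yield (word, definition) pairs; definition is None for unknown words."""
    for word in words:
        yield word, dictionary.get(word)


def lookup_with_broadcast(sc, words, dictionary):
    """Option 1: ship the dictionary to each executor once as a broadcast variable."""
    shared = sc.broadcast(dictionary)  # read-only copy cached on every executor
    rdd = sc.parallelize(words)
    # shared.value is the executor-local dictionary; one lookup loop per partition
    return rdd.mapPartitions(lambda part: tag_words(part, shared.value)).collect()


def lookup_with_join(sc, words, dictionary):
    """Option 2: treat the dictionary as a keyed RDD and join on the word."""
    # Assumption: in a real job the dictionary would be loaded from storage
    # as an RDD instead of being parallelized from driver memory.
    dictionary_rdd = sc.parallelize(list(dictionary.items()))
    keyed_words = sc.parallelize(words).map(lambda w: (w, None))
    # leftOuterJoin keeps words that have no dictionary entry (right side is None)
    joined = keyed_words.leftOuterJoin(dictionary_rdd)
    return joined.mapValues(lambda pair: pair[1]).collect()
```

With a live context (for example `SparkContext("local[2]", "sketch")` from `pyspark`), both functions should return the same pairs, though the join gives no guarantee about their order. The broadcast version avoids a shuffle, which is why it is the usual choice when the dictionary fits in executor memory.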
On Thu, Jun 4, 2015 at 23:50, dgoldenberg
Is the dictionary read-only?
Did you look at
http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables ?
-----Original Message-----
From: dgoldenberg [mailto:dgoldenberg...@gmail.com]
Sent: Thursday, June 04, 2015 4:50 PM
To: user@spark.apache.org
Subject: How to share
Hi there,
I would recommend checking out
https://github.com/spark-jobserver/spark-jobserver which I think gives the
functionality you are looking for.
I haven't tested it though.
BR
On 5 June 2015 at 01:35, Olivier Girardot ssab...@gmail.com wrote:
You can use it as a broadcast variable, but