Hi All,
After a few simple transformations I am trying to save to a local file
system. The code works in local mode but not on a standalone cluster. The
directory *1.txt/_temporary* does exist after the exception.
I would appreciate any suggestions.
scala> d3.sample(false, 0.01, 1).map( pair
Increasing the number of partitions of the data file solved the problem.
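For anyone hitting the same thing, a minimal sketch of the fix, assuming the input is read with sc.textFile; the file name and the partition count of 100 are only illustrative:

  // Read the input with more partitions so no single task
  // has to write one enormous output file
  val d3 = sc.textFile("data.txt", 100)
  d3.sample(false, 0.01, 1).saveAsTextFile("1.txt")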
On 6 June 2014 18:46, Oleg Proudnikov oleg.proudni...@gmail.com wrote:
Additional observation: the map and mapValues are pipelined and executed, as expected, in pairs. This means that there is a simple sequence of steps
Setting SPARK_MEM also sets the driver's JVM to be 8g, rather than just the executors. I think this is the reason why SPARK_MEM was deprecated. See https://github.com/apache/spark/pull/99
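For a standalone application (outside the shell), the setting can also be applied per-app through SparkConf, which avoids touching the driver's JVM. A minimal sketch, where the app name and the 8g value are illustrative:

  import org.apache.spark.{SparkConf, SparkContext}

  // Sets the executor heap for this application only
  val conf = new SparkConf()
    .setAppName("memory-example")        // illustrative name
    .set("spark.executor.memory", "8g")  // executors get 8g; the driver is unaffected
  val sc = new SparkContext(conf)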
On Thu, Jun 5, 2014 at 2:37 PM, Oleg Proudnikov oleg.proudni...@gmail.com
wrote:
Thank you, Andrew,
I am using Spark 0.9.1
Thank you, Hassan!
On 6 June 2014 03:23, hassan hellfire...@gmail.com wrote:
just use -Dspark.executor.memory=
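A sketch of one way to pass that system property when launching the shell on 0.9.x, assuming SPARK_JAVA_OPTS is picked up by the launcher; the 8g value is just an example:

  export SPARK_JAVA_OPTS="-Dspark.executor.memory=8g"
  ./bin/spark-shell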
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Setting-executor-memory-when-using-spark-shell-tp7082p7103.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi All,
I am passing Java static methods into the RDD transformations map and mapValues. The first map goes from a simple string K to a (K, V) pair, where V is a Java ArrayList of large text strings, about 50 KB each, read from Cassandra. mapValues then processes these text blocks into very small ArrayLists.
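A minimal sketch of that shape, with hypothetical stand-ins (JavaHelpers.fetchTexts and JavaHelpers.summarize) for the real Java static methods and the Cassandra read:

  import java.util.ArrayList
  import org.apache.spark.SparkContext._ // enables mapValues on (K, V) RDDs

  // Hypothetical stand-ins for the Java static methods described above
  object JavaHelpers {
    def fetchTexts(k: String): ArrayList[String] =
      new ArrayList[String]() // would read the ~50 KB text blocks from Cassandra
    def summarize(v: ArrayList[String]): ArrayList[String] =
      new ArrayList[String]() // shrinks each block to a very small list
  }

  val keys = sc.parallelize(Seq("k1", "k2"))                // illustrative keys
  val pairs = keys.map(k => (k, JavaHelpers.fetchTexts(k))) // K -> (K, V)
  val results = pairs.mapValues(JavaHelpers.summarize)      // runs pipelined with the map

Since both steps are narrow transformations, they execute back to back inside each task, which matches the pipelining observed earlier in the thread.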
Thank you again,
Oleg
On 6 June 2014 18:05, Patrick Wendell pwend...@gmail.com wrote:
In 1.0+ you can just pass the --executor-memory flag to ./bin/spark-shell.
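For example, with an illustrative 8g value:

  ./bin/spark-shell --executor-memory 8g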
Hi All,
Please help me set Executor JVM memory size. I am using Spark shell and it
appears that the executors are started with a predefined JVM heap of 512m
as soon as Spark shell starts. How can I change this setting? I tried
setting SPARK_EXECUTOR_MEMORY before launching Spark shell:
export SPARK_EXECUTOR_MEMORY=$MEMORY_PER_EXECUTOR
It doesn't seem particularly clean, but it works.
Andrew
Just a thought... Are you trying to use the RDD as a Map?
On 3 June 2014 23:14, Doris Xin doris.s@gmail.com wrote:
Hey Amit,
You might want to check out PairRDDFunctions (http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions). For your use
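If the goal is Map-style access, a small sketch of the two usual options from PairRDDFunctions; the sample data is illustrative:

  import org.apache.spark.SparkContext._ // enables PairRDDFunctions on (K, V) RDDs

  val pairs = sc.parallelize(Seq(("a", 1), ("b", 2)))

  // Option 1: distributed lookup of all values for one key
  val valuesForA: Seq[Int] = pairs.lookup("a")

  // Option 2: collect a *small* pair RDD into a local map on the driver
  val localMap: scala.collection.Map[String, Int] = pairs.collectAsMap()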
It is possible if you use a cartesian product to produce all possible
pairs for each IP address and two stages of map-reduce:
- first by pair of points, to find the total count for each pair, and
- second by IP address, to find the pair with the maximum count for each IP address.
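A rough sketch of those two stages, assuming the input is an RDD[(String, String)] of (ip, point) records; the sample data is illustrative:

  import org.apache.spark.SparkContext._

  val records = sc.parallelize(Seq(
    ("1.2.3.4", "p1"), ("1.2.3.4", "p2"), ("1.2.3.4", "p1")))

  // All pairs of points per IP: self-cartesian, kept only when the IPs match,
  // with p1 < p2 so each unordered pair is counted once
  val paired = records.cartesian(records)
    .filter { case ((ip1, p1), (ip2, p2)) => ip1 == ip2 && p1 < p2 }
    .map { case ((ip, p1), (_, p2)) => ((ip, (p1, p2)), 1) }

  // Stage 1: total count per (ip, pair of points)
  val pairCounts = paired.reduceByKey(_ + _)

  // Stage 2: per IP, keep the pair with the maximum count
  val maxPairPerIp = pairCounts
    .map { case ((ip, pair), n) => (ip, (pair, n)) }
    .reduceByKey((a, b) => if (a._2 >= b._2) a else b)

Note the cartesian product is quadratic in the number of records, so this only makes sense when the per-IP point lists are small.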
Oleg
On 4 June 2014
Hi All,
Is it possible to run a standalone app that would compute and persist/cache
an RDD and then run other standalone apps that would gain access to that
RDD?
--
Thank you,
Oleg
Anwar,
Will try this as it might do exactly what I need. I will follow your
pattern but use sc.textFile() for each file.
I am now thinking that I could start with an RDD of file paths and map it
into (path, content) pairs, provided I could read a file on the server.
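A minimal sketch of that idea, with illustrative paths; it assumes every worker can read the path it receives (e.g. a shared mount):

  import scala.io.Source

  // Hypothetical list of files visible from every worker node
  val paths = sc.parallelize(Seq("/data/a.txt", "/data/b.txt"))

  // Map each path into a (path, content) pair; the read happens on the workers
  val contents = paths.map { p =>
    val src = Source.fromFile(p)
    try (p, src.mkString) finally src.close()
  }

In Spark 1.0+, sc.wholeTextFiles(dir) produces the same (path, content) pairs for a whole directory directly.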
Thank you,
Oleg
On 1 June