Re: Running Spark on a single machine

2014-03-17 Thread goi cto
, Mar 16, 2014 at 11:39 PM, goi cto goi@gmail.com wrote: Hi, I know it is probably not the purpose of Spark, but the syntax is easy and cool... I need to run some Spark-like code in memory on a single machine. Any pointers on how to optimize it to run on only one machine? -- Eran | CTO

Running Spark on a single machine

2014-03-16 Thread goi cto
Hi, I know it is probably not the purpose of Spark, but the syntax is easy and cool... I need to run some Spark-like code in memory on a single machine. Any pointers on how to optimize it to run on only one machine? -- Eran | CTO

How to work with ReduceByKey?

2014-03-13 Thread goi cto
Hi, I have an RDD of <S, Tuple2<I, List>> which I want to reduceByKey to get I+I and a List of Lists (add the integers and build a list of the lists). BUT reduceByKey requires that the return value be of the same type as the input so I can combine the lists.
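One common way around the same-type requirement is to wrap each value into the output shape first (the job mapValues would do on a pair RDD), so that the function handed to reduceByKey takes and returns the same type. The sketch below shows only that idea in plain Java collections, without Spark: the class ReduceByKeySketch, the type Acc, and the helpers lift, combine and reduceByKey are hypothetical names made up for illustration, and the element type String is an assumption since the original message does not say what the lists contain.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReduceByKeySketch {

    /** Hypothetical value type: an integer sum plus a list of lists. */
    public static final class Acc {
        public final int sum;
        public final List<List<String>> lists;

        public Acc(int sum, List<List<String>> lists) {
            this.sum = sum;
            this.lists = lists;
        }
    }

    /** Step 1 (the "mapValues" step): wrap one (I, List) value into the output type. */
    public static Acc lift(int i, List<String> list) {
        List<List<String>> wrapped = new ArrayList<>();
        wrapped.add(list);
        return new Acc(i, wrapped);
    }

    /** Step 2 (the reduce function): input and output types are now identical. */
    public static Acc combine(Acc a, Acc b) {
        List<List<String>> all = new ArrayList<>(a.lists);
        all.addAll(b.lists);
        return new Acc(a.sum + b.sum, all);
    }

    /** Plain-Java stand-in for reduceByKey: merges all values that share a key. */
    public static Map<String, Acc> reduceByKey(List<Map.Entry<String, Acc>> pairs) {
        Map<String, Acc> out = new LinkedHashMap<>();
        for (Map.Entry<String, Acc> p : pairs) {
            out.merge(p.getKey(), p.getValue(), ReduceByKeySketch::combine);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Acc>> pairs = new ArrayList<>();
        pairs.add(new AbstractMap.SimpleEntry<>("a", lift(1, Arrays.asList("x", "y"))));
        pairs.add(new AbstractMap.SimpleEntry<>("b", lift(5, Arrays.asList("q"))));
        pairs.add(new AbstractMap.SimpleEntry<>("a", lift(2, Arrays.asList("z"))));

        // Print each key with its summed integer and its collected list of lists.
        for (Map.Entry<String, Acc> e : reduceByKey(pairs).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().sum + " " + e.getValue().lists);
        }
    }
}
```

On an actual JavaPairRDD the same two steps would presumably be a mapValues call followed by reduceByKey, with combineByKey as the alternative when the wrapping step should be folded into the aggregation itself.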

Re: Problem with delete spark temp dir on spark 0.8.1

2014-03-04 Thread goi cto
. Eran On Tue, Mar 4, 2014 at 11:36 AM, Akhil Das ak...@mobipulse.in wrote: Hi, Try to clean your temp dir, System.getProperty("java.io.tmpdir"). Also, can you paste a longer stacktrace? Thanks Best Regards On Tue, Mar 4, 2014 at 2:55 PM, goi cto goi@gmail.com wrote: Hi, I am

Beginners Hadoop question

2014-03-03 Thread goi cto
Hi, I am sorry for the beginner's question but... I have Spark Java code which reads a file (c:\my-input.csv), processes it, and writes an output file (my-output.csv). Now I want to run it on Hadoop in a distributed environment. 1) Should my input file be one big file or separate smaller files? 2) if