Re: spark performance non-linear response

2015-10-07 Thread Yadid Ayzenberg
Additional missing relevant information: I'm running a transformation, there are no shuffles occurring, and at the end I'm performing a lookup of 4 partitions on the driver. On 10/7/15 11:26 AM, Yadid Ayzenberg wrote: Hi All, I'm using Spark 1.4.1 to analyze a largish data set (several

spark performance non-linear response

2015-10-07 Thread Yadid Ayzenberg
should tweak in order to improve the performance? Or perhaps provide an explanation as to the behavior I'm witnessing? Yadid

Re: spark 1.4.1 - LZFException

2015-09-03 Thread Yadid Ayzenberg
) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) On 9/3/15 2:25 PM, Yadid Ayzenberg wrote: Hi Akhil, No, it seems I

Re: spark 1.4.1 - LZFException

2015-09-03 Thread Yadid Ayzenberg
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) at java.lang.Thread.run(Thread.java:745) Do you think that is related to the problem? Yadid On 8/28/15 1:31 AM, Akhil Das wrote: Is it filling up your disk space? Can you look a bit more in the executor logs to see what's

spark 1.4.1 - LZFException

2015-08-22 Thread Yadid Ayzenberg
? Yadid Job aborted due to stage failure: Task 27 in stage 286.0 failed 4 times, most recent failure: Lost task 27.3 in stage 286.0 (TID 516817, xx.yy.zz.ww): com.esotericsoftware.kryo.KryoException: com.ning.compress.lzf.LZFException: Corrupt input data, block did not start with 2 byte

combining python and java in a single Spark application

2014-10-06 Thread Yadid Ayzenberg
comes to mind is to pass the data to an external python process via pipe, but this seems cumbersome. Can someone provide a better alternative? Yadid - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional

Re: Change delimiter when collecting SchemaRDD

2014-08-29 Thread yadid ayzenberg
Thanks Michael, that makes total sense. It works perfectly. Yadid On Thu, Aug 28, 2014 at 9:19 PM, Michael Armbrust wrote: > The comma is just the way the default toString works for Row objects. > Since SchemaRDDs are also RDDs, you can do arbitrary transformations on > the Row obj

Change delimiter when collecting SchemaRDD

2014-08-28 Thread yadid ayzenberg
Hi All, Is there any way to change the delimiter from being a comma ? Some of the strings in my data contain commas as well, making it very difficult to parse the results. Yadid

Losing Executors on cluster with RDDs of 100GB

2014-08-22 Thread Yadid Ayzenberg
60:53855 ] The worker logs and executor logs do not contain errors. Any ideas what the problem is ? Yadid

SPARKSQL problem with implementing Scala's Product interface

2014-07-10 Thread yadid
te expression. type: UnresolvedAttribute, tree: 'param1 I guess I must be missing a method in the implementation. Any pointers appreciated. Yadid

Re: possible typos in spark 1.0 documentation

2014-05-31 Thread Yadid Ayzenberg
Yep, I just issued a pull request. Yadid On 5/31/14, 1:25 PM, Patrick Wendell wrote: 1. ctx is an instance of JavaSQLContext but the textFile method is called as a member of ctx. According to the API JavaSQLContext does not have such a member, so I'm guessing this should be sc instead. Yeah

possible typos in spark 1.0 documentation

2014-05-30 Thread Yadid Ayzenberg
ile method is called as a member of ctx. According to the API JavaSQLContext does not have such a member, so I'm guessing this should be sc instead. 2. In that same code example the object sqlCtx is referenced, but it is never instantiated in the code. Should this be ctx? Cheers, Yadid

Re: NoSuchMethodError: breeze.linalg.DenseMatrix

2014-05-04 Thread Yadid Ayzenberg
An additional option: 4) Use SparkContext.addJar() and have the application ship your jar to all the nodes. Yadid On 5/4/14, 4:07 PM, DB Tsai wrote: If you add the breeze dependency in your build.sbt project, it will not be available to all the workers. There are a couple of options: 1) use sbt

Re: Strange lookup behavior. Possible bug?

2014-04-30 Thread Yadid Ayzenberg
Dear Sparkers, Has anyone got any insight on this ? I am really stuck. Yadid On 4/28/14, 11:28 AM, Yadid Ayzenberg wrote: Thanks for your answer. I tried running on a single machine - master and worker on one host. I get exactly the same results. Very little CPU activity on the machine in

Re: Strange lookup behavior. Possible bug?

2014-04-28 Thread Yadid Ayzenberg
r(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) On 4/28/14 11:28 AM, Yadid Ayzenberg wrote: Thanks for your answer. I tried running on a single machine - master and worker on one host. I get exactly the same results. Very little CPU activity on the machine

Re: Strange lookup behavior. Possible bug?

2014-04-28 Thread Yadid Ayzenberg
RDD is created by using the newAPIHadoopRDD. Any additional info I can provide? Yadid On 4/28/14 10:46 AM, Daniel Darabos wrote: That is quite mysterious, and I do not think we have enough information to answer. JavaPairRDD.lookup() works fine on a remote Spark cluster: $ MASTER=spark

Re: Strange lookup behavior. Possible bug?

2014-04-27 Thread Yadid Ayzenberg
:37 PM, Yadid Ayzenberg wrote: Some additional information - maybe this rings a bell with someone: I suspect this happens when the lookup returns more than one value. For 0 and 1 values, the function behaves as you would expect. Anyone? On 4/25/14, 1:55 PM, Yadid Ayzenberg wrote: Hi All, I'm

Re: Strange lookup behavior. Possible bug?

2014-04-25 Thread Yadid Ayzenberg
Some additional information - maybe this rings a bell with someone: I suspect this happens when the lookup returns more than one value. For 0 and 1 values, the function behaves as you would expect. Anyone? On 4/25/14, 1:55 PM, Yadid Ayzenberg wrote: Hi All, I'm running a lookup on a

Strange lookup behavior. Possible bug?

2014-04-25 Thread Yadid Ayzenberg
debug this problem ? Thanks, Yadid