Re: df.groupBy('m).agg(sum('n)).show dies with 10^3 elements?

2016-09-06 Thread Jacek Laskowski
Hi Josh, Yes, that seems to be the issue. As I commented in the JIRA just yesterday (after I had sent the email), even simple queries like the following killed spark-shell: Seq(1).toDF.groupBy('value).count.show. Hoping to see it get resolved soon. If there's anything I could help you with
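For reference, what the failing query should compute can be sketched on plain Scala collections (this is only an analogue of the Spark query from the thread, not Spark code):

```scala
// Plain-Scala analogue of Seq(1).toDF.groupBy('value).count.show:
// group the values and count the members of each group.
val counts: Map[Int, Int] =
  Seq(1).groupBy(identity).map { case (k, vs) => (k, vs.size) }
// yields Map(1 -> 1)
```

The Spark version of the same query should produce a single row (value = 1, count = 1), which is why an OOME here is surprising.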

Unable to run docker jdbc integrations test ?

2016-09-06 Thread Suresh Thalamati
Hi, I am getting the following error when I am trying to run the JDBC docker integration tests on my laptop. Any ideas what I might be doing wrong? build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive-thriftserver -Phive -DskipTests clean install build/mvn
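As a point of comparison, the docker JDBC integration tests in Spark are gated behind a Maven profile; a typical invocation looks roughly like the following (the module name is an assumption based on the Spark source layout of that era, so adjust it to your checkout):

```shell
# Build Spark first so the integration-test module can resolve its dependencies.
./build/mvn install -DskipTests

# Then run only the docker integration tests via their profile.
# The module coordinate below is an assumption; check external/docker-integration-tests.
./build/mvn test -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.11
```

Note that these tests also require a working local Docker daemon.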

BlockMatrix Multiplication fails with Out of Memory

2016-09-06 Thread vinodep
Hi, I am trying to multiply a matrix of size 67584x67584 in a loop. In the first iteration, the multiplication goes through, but in the second iteration it fails with a Java heap out-of-memory error. I'm using pyspark and below is the configuration. Setup: 70 nodes (1 driver + 69 workers) with
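A quick back-of-envelope calculation suggests why the heap comes under pressure at this size. This sketch only estimates the footprint of one dense copy of the product as Doubles; BlockMatrix's actual block storage differs, but the order of magnitude is the point:

```scala
// Rough memory estimate for a dense 67584 x 67584 matrix of Doubles.
val n       = 67584L
val entries = n * n               // number of matrix entries
val bytes   = entries * 8L        // 8 bytes per Double
val gib     = bytes.toDouble / (1L << 30)
// gib comes out to roughly 34 GiB for a single dense copy
```

So even one un-freed intermediate result from the first iteration can exhaust a typical executor heap before the second multiplication starts, which matches the "first iteration passes, second fails" symptom.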

df.groupBy('m).agg(sum('n)).show dies with 10^3 elements?

2016-09-06 Thread Jacek Laskowski
Hi, I'm concerned about the OOME in local mode with the version built today: scala> val intsMM = 1 to math.pow(10, 3).toInt intsMM: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
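The groupBy/agg/sum query in the subject line can be expressed on plain Scala collections to make its intended semantics concrete (a sketch only; the column names m and n mirror the thread's query, and the sample data is made up):

```scala
// Plain-Scala analogue of df.groupBy('m).agg(sum('n)):
// group rows by column m, then sum column n within each group.
case class Row(m: Int, n: Int)

val rows = Seq(Row(1, 10), Row(1, 20), Row(2, 5))
val sums: Map[Int, Int] =
  rows.groupBy(_.m).map { case (k, vs) => (k, vs.map(_.n).sum) }
// yields Map(1 -> 30, 2 -> 5)
```

With only 10^3 input elements, the aggregation state is tiny, so an OOME here points at the execution machinery rather than the data volume.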