Eclipse Scala IDE/Scala test and Wiki

2014-06-02 Thread Madhu
I was able to set up Spark in Eclipse using the Spark IDE plugin.
I also got unit tests running with Scala Test, which makes development quick
and easy.
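
For concreteness, here is a minimal sketch of the kind of ScalaTest suite I
mean (the class name, data, and the local[2] master are just illustrative):
it spins up a local-mode SparkContext per suite and exercises a small RDD.

import org.scalatest.{BeforeAndAfterAll, FunSuite}
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

// Illustrative only: one local-mode context per suite keeps IDE test runs fast.
class WordCountSuite extends FunSuite with BeforeAndAfterAll {
  private var sc: SparkContext = _

  override def beforeAll() {
    sc = new SparkContext("local[2]", "WordCountSuite")
  }

  override def afterAll() {
    if (sc != null) sc.stop()
  }

  test("counts words in a small collection") {
    val counts = sc.parallelize(Seq("a", "b", "a"))
      .map(w => (w, 1))
      .reduceByKey(_ + _)
      .collectAsMap()
    assert(counts("a") === 2)
    assert(counts("b") === 1)
  }
}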

I wanted to document the setup steps in this wiki page:

https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-IDESetup

I can't seem to edit that page.
Confluence usually has an Edit button in the upper right, but it does
not appear for me, even though I am logged in.

Am I missing something?



-
--
Madhu
https://www.linkedin.com/in/msiddalingaiah


Re: [VOTE] Release Apache Spark 1.0.0 (rc5)

2014-06-02 Thread Marcelo Vanzin
Hi Patrick,

Thanks for all the explanations, that makes sense. @DeveloperApi
worries me a little bit, especially because of the things Colin
mentions - it's sort of hard to make people move off of APIs, or to
support different versions of the same API. But maybe if expectations
(or lack thereof) are set up front, there will be fewer issues.

You mentioned something in your shading argument that kinda reminded
me of something. Spark currently depends on slf4j implementations and
log4j with compile scope. I'd argue that's the wrong approach if
we're talking about Spark being used embedded inside applications;
Spark should only depend on the slf4j API package, and let the
application provide the underlying implementation.

The assembly jars could include an implementation (since I assume
those are currently targeted at cluster deployment and not embedding).
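
To make that concrete, a rough sbt-style sketch of what I mean (versions and
scopes here are illustrative, not what the Spark build actually declares):

// Sketch only: core compiles against the SLF4J API; a concrete binding is
// left to the embedding application, or pulled in when building the assembly.
libraryDependencies ++= Seq(
  "org.slf4j" % "slf4j-api"     % "1.7.5",
  "org.slf4j" % "slf4j-log4j12" % "1.7.5"  % "provided",
  "log4j"     % "log4j"         % "1.2.17" % "provided"
)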

That way there are fewer sources of conflict at runtime (i.e. the
"multiple implementation jars" messages you can see when running some
Spark programs).

On Fri, May 30, 2014 at 10:54 PM, Patrick Wendell pwend...@gmail.com wrote:
 2. Many libraries like logging subsystems, configuration systems, etc
 rely on static state and initialization. I'm not totally sure how e.g.
 slf4j initializes itself if you have both a shaded and non-shaded copy
 of slf4j present.

-- 
Marcelo


Re: [VOTE] Release Apache Spark 1.0.0 (rc5)

2014-06-02 Thread Sean Owen
On Mon, Jun 2, 2014 at 6:05 PM, Marcelo Vanzin van...@cloudera.com wrote:
 You mentioned something in your shading argument that kinda reminded
 me of something. Spark currently depends on slf4j implementations and
 log4j with compile scope. I'd argue that's the wrong approach if
 we're talking about Spark being used embedded inside applications;
 Spark should only depend on the slf4j API package, and let the
 application provide the underlying implementation.

Good idea in general; in practice, the drawback is that you can't do
things like set log levels if you only depend on the SLF4J API. There
are a few cases where that's nice to control, and that's only possible
if you bind to a particular logger as well.
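
For example (purely illustrative), quieting a chatty logger at runtime needs
the concrete log4j API; slf4j-api deliberately has no notion of setting a level:

import org.apache.log4j.{Level, Logger}

// Only compiles and runs because log4j itself is on the classpath.
Logger.getLogger("org.eclipse.jetty").setLevel(Level.WARN)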

You typically bundle an SLF4J binding anyway, to give a default, or
else the end-user has to know to also bind some SLF4J logger to get
output. Of course it does make for a bit more surgery if you want to
override the binding this way.

Shading can bring a whole new level of confusion; I myself would only
use it where essential as a workaround. Same with trying to make more
elaborate custom classloading schemes -- never in my darkest
nightmares have I imagined the failure modes that probably pop up when
that goes wrong. I think the library collisions will get better over
time as only later versions of Hadoop are in scope, for example,
and/or one build system is in play. I like tackling complexity along
those lines first.


Which version does the binary compatibility test run against by default?

2014-06-02 Thread Xiangrui Meng
Is there a way to specify the target version? -Xiangrui


Re: Eclipse Scala IDE/Scala test and Wiki

2014-06-02 Thread Matei Zaharia
Madhu, can you send me your Wiki username? (Sending it just to me is fine.) I 
can add you to the list to edit it.

Matei

On Jun 2, 2014, at 6:27 PM, Reynold Xin r...@databricks.com wrote:

 I tried but didn't find where I could add you. You probably need Matei to 
 help out with this.
 
 
 
 On Mon, Jun 2, 2014 at 7:43 AM, Madhu ma...@madhu.com wrote:
 I was able to set up Spark in Eclipse using the Spark IDE plugin.
 I also got unit tests running with Scala Test, which makes development quick
 and easy.
 
 I wanted to document the setup steps in this wiki page:
 
 https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-IDESetup
 
 I can't seem to edit that page.
 Confluence usually has an Edit button in the upper right, but it does
 not appear for me, even though I am logged in.
 
 Am I missing something?
 
 
 
 -
 --
 Madhu
 https://www.linkedin.com/in/msiddalingaiah
 



Re: Which version does the binary compatibility test run against by default?

2014-06-02 Thread Patrick Wendell
Yeah - check out sparkPreviousArtifact in the build:
https://github.com/apache/spark/blob/master/project/SparkBuild.scala#L325
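
The baseline version is hard-coded there, so targeting a different release
means changing that setting. Roughly speaking (a sketch against the MiMa sbt
plugin keys of that era; the exact wiring is at the link above), it boils
down to:

import com.typesafe.tools.mima.plugin.MimaKeys.previousArtifact

// Point MiMa at whichever baseline release you want to compare against.
previousArtifact := Some("org.apache.spark" % "spark-core_2.10" % "0.9.1")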

- Patrick

On Mon, Jun 2, 2014 at 5:30 PM, Xiangrui Meng men...@gmail.com wrote:
 Is there a way to specify the target version? -Xiangrui


Spark 1.1-snapshot: java.io.FileNotFoundException from ShuffleMapTask

2014-06-02 Thread npanj
Quite often I notice that a shuffle file is missing and thus a
FileNotFoundException is thrown.
Any idea why a shuffle file would be missing? Am I running low on memory?
(I am using the latest code from the master branch on yarn-hadoop-2.2.)

--
java.io.FileNotFoundException:
/var/storage/sda3/nm-local/usercache/npanj/appcache/application_1401394632504_0131/spark-local-20140603050956-6728/20/shuffle_0_2_97
(No such file or directory)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:116)
    at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:177)
    at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:161)
    at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:158)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
    at org.apache.spark.scheduler.Task.run(Task.scala:51)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
--


