Re: Test fails when compiling spark with tests

2016-09-13 Thread Jakob Odersky
There are some flaky tests that occasionally fail; my first
recommendation would be to re-run the test suite. Another thing to
check is whether any applications are listening on Spark's default
ports.
Btw, what is your environment like? If it is Windows, I don't think
the tests are regularly run against that platform, so they could very
well be broken there.
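For example, to re-run only the failing suite with the bundled Maven
build, something like this should work (a sketch; the scalatest
plugin's wildcardSuites property selects individual Scala suites, and
-Dtest=none skips the Java tests):

mvn -Dtest=none -DwildcardSuites=org.apache.spark.DistributedSuite test

For the port check, "lsof -i :4040" (4040 is the default web UI port)
will show whether another process is already bound there.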

On Sun, Sep 11, 2016 at 10:49 PM, assaf.mendelson wrote:
> Hi,
>
> I am trying to set up a Spark development environment. I forked the Spark
> git project and cloned the fork. I then checked out branch-2.0 (which I
> assume contains the released source code).
>
> I then compiled Spark twice.
>
> The first using:
>
> mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -DskipTests clean package
>
> This compiled successfully.
>
> The second using:
>
> mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 clean package
>
> This one failed in Spark Project Core, with the following test failing:
>
> - caching in memory and disk, replicated
>
> - caching in memory and disk, serialized, replicated *** FAILED ***
>
>   java.util.concurrent.TimeoutException: Can't find 2 executors before 30000 milliseconds elapsed
>   at org.apache.spark.ui.jobs.JobProgressListener.waitUntilExecutorsUp(JobProgressListener.scala:573)
>   at org.apache.spark.DistributedSuite.org$apache$spark$DistributedSuite$$testCaching(DistributedSuite.scala:154)
>   at org.apache.spark.DistributedSuite$$anonfun$32$$anonfun$apply$1.apply$mcV$sp(DistributedSuite.scala:191)
>   at org.apache.spark.DistributedSuite$$anonfun$32$$anonfun$apply$1.apply(DistributedSuite.scala:191)
>   at org.apache.spark.DistributedSuite$$anonfun$32$$anonfun$apply$1.apply(DistributedSuite.scala:191)
>   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   ...
>
> - compute without caching when no partitions fit in memory
>
>
>
> I made no changes to the code whatsoever. Can anyone help me figure out what
> is wrong with my environment?
>
> BTW, I am using Maven 3.3.9 and Java 1.8.0_101-b13.
>
>
>
> Thanks,
>
> Assaf
>




REST API for monitoring Spark Streaming

2016-09-13 Thread Chan Chor Pang

Hi everyone,

I am trying to monitor our streaming application using the Spark REST
interface, only to find that there is no such thing for Streaming.

I wonder if anyone is already working on this, or whether I should just
start implementing my own?
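In the meantime, a StreamingListener can at least collect the per-batch
numbers such an endpoint would serve. A minimal sketch (ssc is an
existing StreamingContext):

  import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

  // Log per-batch metrics; any embedded HTTP server could expose these
  // until a built-in streaming REST endpoint exists.
  class BatchMetricsListener extends StreamingListener {
    override def onBatchCompleted(batch: StreamingListenerBatchCompleted): Unit = {
      val info = batch.batchInfo
      println(s"batch ${info.batchTime}: records=${info.numRecords}, " +
        s"schedulingDelay=${info.schedulingDelay.getOrElse(-1L)} ms, " +
        s"processingDelay=${info.processingDelay.getOrElse(-1L)} ms")
    }
  }

  // ssc.addStreamingListener(new BatchMetricsListener)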


--
BR
Peter Chan





Nominal Attribute

2016-09-13 Thread Danil Kirsanov
NominalAttribute in MLlib is used to represent categorical data internally.
It is barely documented, though, and has a number of limitations: for
example, it supports only integer and string data.
Is there any current effort to expose it (and categorical data handling in
general) to users, or is it intended to remain an internal MLlib data
representation?
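For context, a minimal sketch of the current usage (the "color" column
and its labels are made up for illustration):

  import org.apache.spark.ml.attribute.NominalAttribute

  // Describe a categorical column by recording its label set as ML metadata.
  val colorAttr = NominalAttribute.defaultAttr
    .withName("color")
    .withValues("red", "green", "blue")

  colorAttr.indexOf("green")  // 1: label -> integer index
  colorAttr.getValue(0)       // "red": integer index -> label

  // Attach to an integer-coded DataFrame column for ML components to read:
  // df.select(df("color").as("color", colorAttr.toMetadata()))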

Thank you,
Danil







Spark SQL - Applying transformation on a struct inside an array

2016-09-13 Thread Olivier Girardot
Hi everyone,

I'm currently trying to create a generic transformation mechanism on a
DataFrame to modify an arbitrary column regardless of the underlying
schema.
It's "relatively" straightforward for flat complex types like struct<...>:
apply an arbitrary UDF on the column and replace the data "inside" the
struct. However, I'm struggling to make it work for complex types that
contain arrays along the way, like struct<array<struct<...>>>.
Michael Armbrust seemed to allude, on the mailing list/forum, to a way of
using Encoders to do that. I'd be interested in any pointers, especially
considering that it's not possible to return a Row or GenericRowWithSchema
from a UDF (thanks to
https://github.com/apache/spark/blob/v2.0.0/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L657
it seems).
To sum up, I'd like to find a way to apply a transformation to complex
nested datatypes (arrays and structs) in a DataFrame, updating the values
themselves.
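For what it's worth, here is a sketch of what I imagine the Encoder-based
approach looks like: model the nested schema with case classes and use a
typed map instead of a UDF (Inner, Record and the uppercase transform are
made up; df is a DataFrame matching the Record schema and spark an
existing SparkSession). It works, but it is not schema-generic, which is
what I'm after:

  import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

  case class Inner(a: Int, b: String)
  case class Record(id: Long, items: Seq[Inner])

  // A typed map sidesteps the "UDFs cannot return Row" restriction:
  // Catalyst derives Encoders for case classes, so the nested
  // array-of-struct values can be rebuilt element by element.
  def transform(spark: SparkSession, df: DataFrame): Dataset[Record] = {
    import spark.implicits._
    df.as[Record].map { r =>
      r.copy(items = r.items.map(i => i.copy(b = i.b.toUpperCase)))
    }
  }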
Regards,
Olivier Girardot