pyspark does not seem to start py4j callback server

2015-11-24 Thread girishlg
Hi, we have a use case where we call a Scala function with a Python object as a callback. The Python object implements a Scala trait. The call to the Scala function goes through, but when it calls back through the passed-in Python object we get a connection refused error. Looking further we

Re: A proposal for Spark 2.0

2015-11-24 Thread Sandy Ryza
I think that Kostas' logic still holds. The majority of Spark users, and likely an even vaster majority of people running vaster jobs, are still on RDDs and on the cusp of upgrading to DataFrames. Users will probably want to upgrade to the stable version of the Dataset / DataFrame API so they

Re: [ANNOUNCE] Spark 1.6.0 Release Preview

2015-11-24 Thread Ted Yu
If I am not mistaken, the binaries for Scala 2.11 were generated against Hadoop 1. What about binaries for Scala 2.11 against Hadoop 2.x? Cheers On Sun, Nov 22, 2015 at 2:21 PM, Michael Armbrust wrote: > In order to facilitate community testing of Spark 1.6.0, I'm

Re: A proposal for Spark 2.0

2015-11-24 Thread Matei Zaharia
What are the other breaking changes in 2.0 though? Note that we're not removing Scala 2.10, we're just making the default build be against Scala 2.11 instead of 2.10. There seem to be very few changes that people would worry about. If people are going to update their apps, I think it's better

sqlContext vs hivecontext

2015-11-24 Thread Pranay Tonpay
Hi, when I use HiveContext, I can query individual columns from a table, whereas with sqlContext only a select * works. Is it possible to use sqlContext and still query specific columns from a Hive table?

what should I know to implement twitter streaming for pyspark?

2015-11-24 Thread Amir Rahnama
I want to end the situation where Python users of Spark need to implement the Twitter source for streaming by themselves. Is there anything I need to know? I have looked at the Scala and Java code and got some ideas.

Re: what should I know to implement twitter streaming for pyspark?

2015-11-24 Thread Julio Antonio Soto de Vicente
Hi Amir, I believe that the first step should be looking for a library that implements the streaming API. > On 24/11/2015, at 10:32, Amir Rahnama wrote: > > I wanna end the situation where python users of spark need to implement the > twitter source for

Streaming : stopping output transformations explicitly

2015-11-24 Thread Yogesh Mahajan
Hi, is there a way to stop output transformations on a stream without stopping the StreamingContext? Yogesh Mahajan SnappyData Inc, OLTP+OLAP inside Spark for real time analytics