Re: Spark + Kinesis

2015-05-09 Thread Vadim Bichutskiy
Thanks Chris! I was just looking to get back to Spark + Kinesis integration. Will be in touch shortly. Vadim On Sun, May 10, 2015 at 12:14 AM, Chris Fregly wrote: > hey vadim- > > sorry for the delay. > > if you're interested in trying to get Kinesis working one-on-one, shoot me > a direct em

Re: Spark + Kinesis

2015-05-09 Thread Chris Fregly
hey vadim- sorry for the delay. if you're interested in trying to get Kinesis working one-on-one, shoot me a direct email and we'll get it going off-list. we can circle back and summarize our findings here. lots of people are using Spark Streaming+Kinesis successfully. would love to help you t

Re: Intellij Spark Source Compilation

2015-05-09 Thread rtimp
Hi Iulian, Thanks for the reply! With respect to Eclipse, I'm doing this all with a fresh download of the Scala IDE (Build id: 4.0.0-vfinal-20150305-1644-Typesafe) and with a recent pull (as of this morning) of the master branch. When I proceed through the instructions for Eclipse (creating the p

PySpark DataFrame: Preserving nesting when selecting a nested field

2015-05-09 Thread Nicholas Chammas
Take a look:

>>> df = sqlContext.jsonRDD(sc.parallelize(['{"settings": {"os": "OS X", "version": "10.10"}}']))
>>> df.printSchema()
root
 |-- settings: struct (nullable = true)
 |    |-- os: string (nullable = true)
 |    |-- version: string (nullable = true)

>>> # Now I want to "drop" the ver
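
For readers without a Spark session handy, the idea in the subject line (selecting one nested field while keeping its parent struct) can be sketched on plain Python dictionaries. This is only an analogy, not the DataFrame API: `select_nested` is a hypothetical helper introduced here, and the sample row simply mirrors the JSON in the message above.

```python
def select_nested(record, path):
    """Return a new dict holding only the value at `path`, with its parent keys kept."""
    # Walk down to the requested value.
    value = record
    for key in path:
        value = value[key]
    # Rebuild the enclosing dictionaries from the innermost key outwards.
    result = value
    for key in reversed(path):
        result = {key: result}
    return result


# Sample row mirroring the JSON document from the message.
row = {"settings": {"os": "OS X", "version": "10.10"}}

# Keep settings.os, but still nested under "settings".
selected = select_nested(row, ["settings", "os"])
print(selected)
```

The input row is left untouched; the helper builds a fresh nested dictionary instead of deleting keys in place.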

Re: Having pyspark.sql.types.StructType implement __iter__()

2015-05-09 Thread Nicholas Chammas
I've filed SPARK-7507 for this. On Fri, May 8, 2015 at 5:57 PM Reynold Xin wrote: > Sure. > > > On Fri, May 8, 2015 at 2:43 PM, Nicholas Chammas < > nicholas.cham...@gmail.com> wrote: > >> StructType looks an awful lot like a Python dictionary.
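
As a rough illustration of what the subject line asks for, here is a minimal sketch of a dictionary-like container of fields that supports iteration. It assumes nothing about the actual pyspark.sql.types implementation: `Field` and `Struct` are hypothetical stand-ins written for this example.

```python
from collections import namedtuple

# Hypothetical stand-ins, NOT the real pyspark.sql.types classes.
Field = namedtuple("Field", ["name", "data_type"])


class Struct:
    """Toy container of named fields supporting iteration, len() and lookup by name."""

    def __init__(self, fields):
        self.fields = list(fields)

    def __iter__(self):
        # Iterating the struct yields its fields in declaration order.
        return iter(self.fields)

    def __len__(self):
        return len(self.fields)

    def __getitem__(self, name):
        # Dictionary-style lookup by field name.
        for field in self.fields:
            if field.name == name:
                return field
        raise KeyError(name)


schema = Struct([Field("os", "string"), Field("version", "string")])

# With __iter__ defined, the struct works directly in loops and comprehensions.
names = [field.name for field in schema]
print(names, len(schema), schema["os"].data_type)
```

Defining `__iter__` is what lets callers write `for field in schema` instead of reaching into `schema.fields`.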

Re: DataFrames equivalent to SQL table namespacing and aliases

2015-05-09 Thread Nicholas Chammas
I've opened an issue for a few doc fixes that the PySpark DataFrame API needs: SPARK-7505 On Fri, May 8, 2015 at 3:10 PM Nicholas Chammas wrote: > Ah, neat. So in the example I gave earlier, I’d do this to get columns > from specific dataframes:

Re: pyspark.sql.types.StructType.fromJson() is a lie

2015-05-09 Thread Nicholas Chammas
I've reported this in SPARK-7506. On Thu, May 7, 2015 at 6:58 PM Nicholas Chammas wrote: > Renaming fields to get around SPARK-2775. > > I’m doing this clunky thing: > >1. Convert a DataFr

Re: Integration with Apache Ignite

2015-05-09 Thread dsetrakyan
rxin wrote > There is some work to create an off-heap storage API for Spark. I think > with it, it will be easier to support different storage backends. > > https://issues.apache.org/jira/browse/SPARK-6479 > > With that API in place, rest of the integration should probably just live > outside of

Spark featured on Research Podcast

2015-05-09 Thread Brock Palen
Thanks to Matei for taking time to talk to us! You can find the full interview at: http://www.rce-cast.com/Podcast/spark.html If you have topics for future shows please contact me off list. Brock Palen www.umich.edu/~brockp Assoc. Director Advanced Research Computing - TS XSEDE Campus Champion b

Re: Intellij Spark Source Compilation

2015-05-09 Thread Iulian Dragoș
On Sat, May 9, 2015 at 12:29 AM, rtimp wrote: > Hello, > > I'm trying to compile the master branch of the spark source (25889d8) in > intellij. I followed the instructions in the wiki > https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools, > namely I downloaded IntelliJ 14.1.2

Re: Build fail...

2015-05-09 Thread Sean Owen
Ah! of course. That one is my fault. Thank you Andrew for fixing that up. On Sat, May 9, 2015 at 3:13 AM, Andrew Or wrote: > Thanks for pointing this out. I reverted that commit. > > 2015-05-08 19:01 GMT-07:00 Ted Yu : > >> Looks like you're right: >> >> >> https://amplab.cs.berkeley.edu/jenkins/