Regarding master node failure

2015-07-07 Thread swetha
Hi, What happens if a master node fails in the case of Spark Streaming? Would the data be lost in that case? Thanks, Swetha -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Regarding-master-node-failure-tp13055.html Sent from the Apache Spark

Data interaction between various RDDs in Spark Streaming

2015-07-07 Thread swetha
Hi, Suppose I want the data to be grouped by an Id named 12345. I have a certain amount of data coming out of one batch for 12345, and more data related to 12345 arriving 5 hours later. How do I group by 12345 and have a single RDD of list? Thanks, Swetha

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-07 Thread Andrew Or
+1 Verified that the previous blockers SPARK-8781 and SPARK-8819 are now resolved. 2015-07-07 12:06 GMT-07:00 Patrick Wendell pwend...@gmail.com: Please vote on releasing the following candidate as Apache Spark version 1.4.1! This release fixes a handful of known issues in Spark 1.4.0,

Re: Data interaction between various RDDs in Spark Streaming

2015-07-07 Thread Akhil Das
updateStateByKey? Thanks Best Regards On Wed, Jul 8, 2015 at 1:05 AM, swetha swethakasire...@gmail.com wrote: Hi, Suppose I want the data to be grouped by an Id named 12345 and I have a certain amount of data coming out from one batch for 12345 and I have data related to 12345 coming after
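The updateStateByKey suggestion can be illustrated with a minimal sketch. It assumes the goal is to keep one growing list per Id across batches. The update function below has the shape PySpark's updateStateByKey expects (the new values for a key in the current batch, plus the previous state or None). The run_batches helper and the sample data are hypothetical; they are only a local stand-in for a streaming job so the logic can be checked without a cluster.

```python
from collections import defaultdict


def update_events(new_values, state):
    """Append this batch's values for a key to the list kept as state.

    state is None the first time a key is seen.
    """
    return (state or []) + list(new_values)


def run_batches(batches, update_func):
    """Hypothetical local stand-in for a stream: apply update_func per key, batch by batch."""
    state = {}
    for batch in batches:
        grouped = defaultdict(list)
        for key, value in batch:
            grouped[key].append(value)
        # Visit every key that has new data or existing state.
        for key in set(state) | set(grouped):
            state[key] = update_func(grouped.get(key, []), state.get(key))
    return state


batches = [
    [("12345", "a"), ("12345", "b"), ("67890", "x")],  # first batch
    [("67890", "y")],                                   # nothing for 12345 here
    [("12345", "c")],                                   # 12345 shows up again much later
]
final_state = run_batches(batches, update_events)
print(final_state["12345"])  # ['a', 'b', 'c']

# In an actual PySpark job (assumption: pairs is a DStream of (id, value) tuples
# and checkpointing is enabled on the StreamingContext), the same function
# would be passed as: pairs.updateStateByKey(update_events)
```

Note that state kept this way grows without bound unless the update function trims it or returns None to drop a key.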

[RESULT] [VOTE] Release Apache Spark 1.4.1 (RC2)

2015-07-07 Thread Patrick Wendell
Hey All, This vote is cancelled in favor of RC3. - Patrick On Fri, Jul 3, 2015 at 1:15 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.4.1! This release fixes a handful of known issues in Spark 1.4.0, listed here:

Re: Unable to add to roles in JIRA

2015-07-07 Thread Reynold Xin
BTW Infra has the ability to create multiple groups. Maybe that's a better solution. Have contributor1, contributor2, contributor3 ... On Tue, Jul 7, 2015 at 1:42 PM, Sean Owen so...@cloudera.com wrote: Yeah, I've just realized a problem, that the permission for Developer are not the same as

Re: Unable to add to roles in JIRA

2015-07-07 Thread Reynold Xin
I've been adding people to the Developer role to get around the JIRA limit. On Tue, Jul 7, 2015 at 3:05 AM, Sean Owen so...@cloudera.com wrote: PS the resolution on this is just that we've hit a JIRA limit, since the Contributor role is so big now. We have a currently-unused Developer role

spark - redshift !!!

2015-07-07 Thread spark user
Hi, can you help me load data from an S3 bucket into Redshift? If you have sample code, could you please send it to me? Thanks, su

Re: Unable to add to roles in JIRA

2015-07-07 Thread Sean Owen
PS the resolution on this is just that we've hit a JIRA limit, since the Contributor role is so big now. We have a currently-unused Developer role that barely has different permissions. I propose to move people that I recognize as regular Contributors into the Developer group to make room.

TableScan vs PrunedScan

2015-07-07 Thread Gil Vernik
Hi All, I wanted to experiment a little bit with TableScan and PrunedScan. My first test was to print columns from various SQL queries. To make this test easier, I just took spark-csv and replaced TableScan with PrunedScan. I then changed the buildScan method of CsvRelation from def BuildScan

Re: TableScan vs PrunedScan

2015-07-07 Thread Ram Sriharsha
Hi Gil You would need to prune the resulting Row as well based on the requested columns. Ram Sent from my iPhone On Jul 7, 2015, at 3:12 AM, Gil Vernik g...@il.ibm.com wrote: Hi All, I wanted to experiment a little bit with TableScan and PrunedScan. My first test was to print
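Ram's point can be sketched in a few lines: with PrunedScan, each returned row should carry only the requested columns, in the requested order, rather than the full record. The snippet below is a language-neutral illustration in Python, not spark-csv's code. The prune_row helper, the schema, and the sample rows are all hypothetical.

```python
def prune_row(schema, row, required_columns):
    """Project a full row down to the requested columns, in the requested order."""
    index_of = {name: i for i, name in enumerate(schema)}
    return tuple(row[index_of[name]] for name in required_columns)


# Hypothetical full schema and rows, as a TableScan-style relation would return them.
schema = ["year", "make", "model"]
rows = [("2012", "Tesla", "S"), ("1997", "Ford", "E350")]

# A query such as "SELECT model, year ..." would request only these columns.
pruned = [prune_row(schema, r, ["model", "year"]) for r in rows]
print(pruned)  # [('S', '2012'), ('E350', '1997')]
```

Returning the full row while only the type of the relation changes would leave the values misaligned with the pruned schema the query expects.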

thrift server reliability issue

2015-07-07 Thread Judy Nash
Hi everyone, Found a thrift server reliability issue on Spark 1.3.1 that causes thrift to fail. When the thrift server has too little memory allocated to the driver to process a request, its Spark SQL session exits with an OutOfMemory exception, causing the thrift server to stop working. Is this a

Spark job hangs when History server events are written to hdfs

2015-07-07 Thread Pankaj Arora
Hi, I am running a long-running application on YARN using Spark, and I am facing issues with Spark’s history server when the events are written to HDFS. It seems to work fine for some time, and then I see the following exception. 2015-06-01 00:00:03,247 [SparkListenerBus] ERROR