Re: Update Batch DF with Streaming

2016-06-19 Thread Amit Assudani
Please help From: amit assudani <aassud...@impetus.com> Date: Thursday, June 16, 2016 at 6:11 PM To: "user@spark.apache.org" <user@spark.apache.org> Subject: Update Batch DF with Streaming Hi All, Can I update batch data frames loaded in memory with Streaming data

Update Batch DF with Streaming

2016-06-16 Thread Amit Assudani
Hi All, Can I update batch data frames loaded in memory with streaming data? For example, I have an employee DF registered as a temporary table; it has EmployeeID, Name, Address, etc. fields, and assuming it is very big and takes time to load into memory, I've two types of employee events (both
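A minimal sketch of one way this can be approached, assuming Spark 1.6-era APIs (the "employees" table name, the schema, and the update stream are hypothetical): keep the expensive base DataFrame cached, and on each batch union the streamed rows into it and re-register the temp table. Note this appends rows; replacing existing employees would additionally need a key-based merge on EmployeeID.

    import java.util.concurrent.atomic.AtomicReference;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SQLContext;
    import org.apache.spark.sql.types.StructType;
    import org.apache.spark.streaming.api.java.JavaDStream;

    public final class EmployeeTableRefresher {
      // Register `base` as the "employees" temp table, then fold streamed
      // update rows into it batch by batch (Spark 1.6 foreachRDD overload).
      public static void wire(final SQLContext sqlContext, DataFrame base,
                              JavaDStream<Row> updates, final StructType schema) {
        base.cache().registerTempTable("employees");
        final AtomicReference<DataFrame> current = new AtomicReference<>(base);
        updates.foreachRDD(rdd -> {
          if (!rdd.isEmpty()) {
            DataFrame delta = sqlContext.createDataFrame(rdd, schema);
            // unionAll keeps the slow base load out of the per-batch path;
            // re-registering makes the new rows visible to later SQL queries
            DataFrame merged = current.get().unionAll(delta).cache();
            merged.registerTempTable("employees");
            current.set(merged);
          }
        });
      }
    }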

Re: How to recover in case user errors in streaming

2015-07-01 Thread Amit Assudani
Date: Monday, June 29, 2015 at 5:24 PM To: amit assudani aassud...@impetus.com Cc: Cody Koeninger c...@koeninger.org, user@spark.apache.org Subject: Re: How to recover

Re: How to recover in case user errors in streaming

2015-06-29 Thread Amit Assudani
, June 27, 2015 at 5:14 AM To: amit assudani aassud...@impetus.com Cc: Cody Koeninger c...@koeninger.org, user@spark.apache.org Subject: Re: How to recover in case

Re: How to recover in case user errors in streaming

2015-06-29 Thread Amit Assudani
Also, how do you suggest catching exceptions when using connector APIs like saveAsNewAPIHadoopFiles? From: amit assudani aassud...@impetus.com Date: Monday, June 29, 2015 at 9:55 AM To: Tathagata Das t...@databricks.com Cc: Cody
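saveAsNewAPIHadoopFiles is a DStream output operation, so a try/catch around the call itself never sees per-batch failures. One workaround (a sketch, assuming a JavaPairDStream<Text, Text> and Spark 1.6's (rdd, time) foreachRDD overload; the path prefix is hypothetical) is to drop down to the per-RDD save and catch there:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
    import org.apache.spark.streaming.api.java.JavaPairDStream;

    static void wireSaves(JavaPairDStream<Text, Text> pairs) {
      pairs.foreachRDD((rdd, time) -> {
        String path = "hdfs:///out/batch-" + time.milliseconds();  // hypothetical prefix
        try {
          rdd.saveAsNewAPIHadoopFile(path, Text.class, Text.class,
              TextOutputFormat.class, new Configuration());
        } catch (Exception e) {
          // only this batch's save failed; log and move on so the stream survives
          System.err.println("save failed for batch " + time + ": " + e);
        }
      });
    }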

Checkpoint FS failure or connectivity issue

2015-06-29 Thread Amit Assudani
Hi All, While using checkpoints (with HDFS), if connectivity to the Hadoop cluster is lost for a while and restored some time later, what happens to the running streaming job? Is it always assumed that the connection to the checkpoint FS (in this case HDFS) would ALWAYS be HA and would never fail for
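For reference, a minimal sketch of the checkpoint wiring under discussion (paths and batch interval hypothetical; Spark 1.6's Function0 overload of getOrCreate): the driver writes metadata checkpoints to this directory every batch during the run and reads them back on restart, which is why a temporarily unreachable checkpoint FS matters.

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public final class CheckpointedApp {
      public static void main(String[] args) throws Exception {
        final String checkpointDir = "hdfs:///checkpoints/my-app";  // hypothetical path
        JavaStreamingContext jssc = JavaStreamingContext.getOrCreate(checkpointDir, () -> {
          SparkConf conf = new SparkConf().setAppName("checkpointed-app");
          JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(30));
          ssc.checkpoint(checkpointDir);  // metadata checkpoints land on HDFS each batch
          // ... define the DStream graph here ...
          return ssc;
        });
        jssc.start();
        jssc.awaitTermination();
      }
    }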

Re: How to recover in case user errors in streaming

2015-06-26 Thread Amit Assudani
in hand. Regards, Amit From: Cody Koeninger c...@koeninger.org Date: Friday, June 26, 2015 at 11:32 AM To: amit assudani aassud...@impetus.com Cc: user@spark.apache.org

How to recover in case user errors in streaming

2015-06-26 Thread Amit Assudani
Problem: how do we recover from user errors (connectivity issues / storage service down / etc.)? Environment: Spark streaming using Kafka Direct Streams. Code Snippet: HashSet<String> topicsSet = new HashSet<String>(Arrays.asList(kafkaTopic1)); HashMap<String, String> kafkaParams = new HashMap<String,
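A fuller hedged sketch of the setup above plus the pattern usually suggested for user errors (the broker address, writeToStore, and handleFailedBatch are hypothetical): a failed output action only fails that batch's job, so exceptions must be caught around the action inside foreachRDD if failed data is to be handled rather than silently dropped.

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.HashSet;
    import kafka.serializer.StringDecoder;
    import org.apache.spark.streaming.api.java.JavaPairInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    static void wire(JavaStreamingContext jssc) {
      HashSet<String> topicsSet = new HashSet<String>(Arrays.asList("kafkaTopic1"));
      HashMap<String, String> kafkaParams = new HashMap<String, String>();
      kafkaParams.put("metadata.broker.list", "broker1:9092");  // hypothetical broker

      JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
          jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
          kafkaParams, topicsSet);

      messages.foreachRDD(rdd -> {
        try {
          // the action triggers the job; a down storage service surfaces here on the driver
          rdd.foreach(record -> writeToStore(record._2()));  // writeToStore: hypothetical
        } catch (Exception e) {
          // only this batch's job fails; later batches are still scheduled,
          // so anything not handled here is effectively lost
          handleFailedBatch(e);  // hypothetical handler: log, park for retry, alert
        }
      });
    }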

Re: How to recover in case user errors in streaming

2015-06-26 Thread Amit Assudani
:16 AM To: amit assudani aassud...@impetus.com Cc: user@spark.apache.org, Tathagata Das t...@databricks.com Subject: Re: How to recover in case user errors in streaming

Re: How to recover in case user errors in streaming

2015-06-26 Thread Amit Assudani
Also, as I understand it, max failures doesn’t stop the entire stream; it fails the job created for the specific batch, but the subsequent batches still proceed, isn’t that right? And the question still remains: how do we keep track of those failed batches? From: amit assudani aassud
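That understanding matches the direct-stream model, and one hedged way to track failed batches, continuing the direct-stream sketch above (recordFailure and process are hypothetical), is to capture the batch time and Kafka offset ranges in foreachRDD and persist them when that batch's job throws, so the exact offsets can be replayed later:

    import org.apache.spark.streaming.kafka.HasOffsetRanges;
    import org.apache.spark.streaming.kafka.OffsetRange;

    messages.foreachRDD((rdd, time) -> {
      // with the direct approach the offsets are known up front, before the job runs
      OffsetRange[] offsets = ((HasOffsetRanges) rdd.rdd()).offsetRanges();
      try {
        rdd.foreachPartition(iter -> {
          while (iter.hasNext()) {
            process(iter.next());  // process: hypothetical per-record handler
          }
        });
      } catch (Exception e) {
        // the stream moves on to the next batch, so persist enough to replay this one
        recordFailure(time.milliseconds(), offsets);  // hypothetical durable ledger
      }
    });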

Re: How to recover in case user errors in streaming

2015-06-26 Thread Amit Assudani
Also, TaskContext.get() returns null when used in the foreach function below (I get it when I use it in map, but the whole point here is to handle something that is breaking in the action). Please help. :( From: amit assudani aassud...@impetus.com Date: Friday, June 26, 2015
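For what it's worth, a small sketch of where TaskContext is defined (the `stream` variable is hypothetical): the body passed to foreachRDD runs on the driver, where TaskContext.get() is null; only functions shipped into the RDD's tasks, like map or foreachPartition, see a non-null TaskContext.

    import org.apache.spark.TaskContext;
    import org.apache.spark.streaming.api.java.JavaDStream;

    static void show(JavaDStream<String> stream) {
      stream.foreachRDD(rdd -> {
        TaskContext onDriver = TaskContext.get();    // null: this code runs on the driver
        rdd.foreachPartition(iter -> {
          TaskContext inTask = TaskContext.get();    // non-null: executor-side task code
          int attempt = inTask.attemptNumber();      // usable e.g. for retry bookkeeping
          while (iter.hasNext()) { iter.next(); }
        });
      });
    }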

Lookup / Access of master data in spark streaming

2015-04-09 Thread Amit Assudani
Hi Friends, I am trying to solve a use case in Spark Streaming and need help choosing the right approach to look up / update the master data. Use case (simplified): I have a dataset of entities with three attributes and an identifier/row key in a persistent store. Each attribute along with the row key
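One common pattern for this kind of lookup (a sketch, not necessarily the approach the thread settled on; loadMaster and the string-typed keys/values are hypothetical): keep the master data as a cached pair RDD keyed by row key, join each batch against it via transform, and swap in a freshly loaded copy when the master changes.

    import java.util.concurrent.atomic.AtomicReference;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.streaming.api.java.JavaPairDStream;
    import scala.Tuple2;

    static void wire(JavaSparkContext sc, JavaPairDStream<String, String> events) {
      // master: rowKey -> attributes, loaded from the persistent store (hypothetical)
      final AtomicReference<JavaPairRDD<String, String>> master =
          new AtomicReference<>(loadMaster(sc).cache());

      // transform re-evaluates master.get() on every batch, so a refreshed RDD
      // swapped in below is picked up by the next micro-batch automatically
      JavaPairDStream<String, Tuple2<String, String>> enriched =
          events.transformToPair(batch -> batch.join(master.get()));

      // ... on some schedule (e.g. a background timer), refresh the reference:
      // master.set(loadMaster(sc).cache());
    }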

Re: Lookup / Access of master data in spark streaming

2015-04-09 Thread Amit Assudani
, Amit From: Tathagata Das t...@databricks.com Date: Thursday, April 9, 2015 at 3:13 PM To: amit assudani aassud...@impetus.com Cc: user@spark.apache.org Subject: Re