Re: [Spark-Core] Long scheduling delays (1+ hour)

2018-11-07 Thread Biplob Biswas
will end up having a lot of scheduling delay. Maybe look into why it takes 1 min to process 100 records and fix the logic. Also, I see that you sometimes have a higher number of events with a lower amount of processing time. Fix the code logic and this should be resolved. Thanks & Regards Biplob Bi

Re: [Spark Shell on AWS K8s Cluster]: Is there more documentation regarding how to run spark-shell on k8s cluster?

2018-10-31 Thread Biplob Biswas
Hi Yuqi, Just curious, can you share the spark-submit script and what you are passing as the --master argument? Thanks & Regards Biplob Biswas On Wed, Oct 31, 2018 at 10:34 AM Gourav Sengupta wrote: > Just out of curiosity why would you not use Glue (which is Spark on > kubernet

Re: unsubscribe

2018-10-30 Thread Biplob Biswas
You need to send the email to user-unsubscr...@spark.apache.org and not to the user group. Thanks & Regards Biplob Biswas On Tue, Oct 30, 2018 at 10:59 AM Anu B Nair wrote: > I am sending this Unsubscribe mail for last few months! It never happens! > If anyone can help us to unsubscr

Re: Replacing groupByKey() with reduceByKey()

2018-08-08 Thread Biplob Biswas
(build_edges, 25) Although based on the return type you would have to modify your build_edges function. Thanks & Regards Biplob Biswas On Mon, Aug 6, 2018 at 6:28 PM Bathi CCDB wrote: > Hey Bipin, > Thanks for the reply, I am actually aggregating after the groupByKey() > operation, &g

Re: Replacing groupByKey() with reduceByKey()

2018-08-06 Thread Biplob Biswas
Hi Santhosh, If you are not performing any aggregation, then I don't think you can replace your groupByKey with a reduceByKey. As I see it, you are only grouping and taking 2 values of the result, so I believe you can't just replace your groupByKey with that. Thanks & Regards Biplob Bi
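
A minimal Scala sketch of the distinction discussed in this thread, using hypothetical (key, value) data and the partition count of 25 from the snippet above: reduceByKey only fits pairwise-combinable aggregations, while grouping and then taking a couple of values per key still needs groupByKey (or aggregateByKey).

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("group-vs-reduce").setMaster("local[*]"))

    // Hypothetical (key, value) pairs standing in for the poster's data.
    val pairs = sc.parallelize(Seq(("a", 1), ("a", 5), ("b", 2), ("a", 3)))

    // groupByKey keeps every value for a key, which is what you need when you
    // only want to inspect a few of them (here: the first 2), but it shuffles
    // all values across the cluster.
    val firstTwo = pairs.groupByKey().mapValues(_.take(2).toList)

    // reduceByKey is only a drop-in replacement when the per-key operation is a
    // pairwise-combinable aggregation; it combines map-side before the shuffle.
    val sums = pairs.reduceByKey(_ + _, 25) // 25 partitions, as in the snippet above

    firstTwo.collect().foreach(println)
    sums.collect().foreach(println)
    sc.stop()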

Re: Backpressure initial rate not working

2018-07-26 Thread Biplob Biswas
Hi Todd, Thanks a lot, that works. Although I am curious whether you know why the initialRate setting is not kicking in. But for now the pipeline is usable again. Thanks a lot. Thanks & Regards Biplob Biswas On Thu, Jul 26, 2018 at 3:03 PM Todd Nist wrote: > Have you tried r

Re: Backpressure initial rate not working

2018-07-26 Thread Biplob Biswas
--class "${MAIN_CLASS}" \ "${ARTIFACT_FILE}" The first batch is huge; even if it had worked for the first batch, I would've tried researching more. The problem is that the first batch is more than 500k records. Thanks & Regards Biplob Biswas On Thu, Jul 26, 2018 a

Re: Backpressure initial rate not working

2018-07-26 Thread Biplob Biswas
Did anyone face a similar issue? And is there any viable way to solve this? Thanks & Regards Biplob Biswas On Wed, Jul 25, 2018 at 4:23 PM Biplob Biswas wrote: > I have enabled the spark.streaming.backpressure.enabled setting and also > set spark.streaming.backpressure.initialRate to 150

Backpressure initial rate not working

2018-07-25 Thread Biplob Biswas
and they are all taken in 1 huge batch, which ultimately takes a long time and fails with an executor failure exception. We don't have more resources to give in our test cluster, and we expect the backpressure to kick in and take smaller batches. What could I be doing wrong? Thanks & Regards Biplob Biswas
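
For reference, a minimal sketch of the rate-limit settings mentioned in this thread. The reply that ultimately worked is truncated above, so the hard per-partition cap shown here (spark.streaming.kafka.maxRatePerPartition) is an assumption about the usual workaround when backpressure's initialRate does not limit the first batch of a direct Kafka stream; the batch interval is illustrative.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setAppName("backpressure-sketch")
      .setMaster("local[2]") // master normally comes from spark-submit; set here only so the sketch runs standalone
      // Let the rate estimator adapt batch sizes to observed processing time.
      .set("spark.streaming.backpressure.enabled", "true")
      // Initial rate (records/sec) for the estimator, as set in the thread.
      .set("spark.streaming.backpressure.initialRate", "150")
      // Hard cap per Kafka partition (records/sec) for the direct stream; the
      // first-batch size is then bounded by rate * partitions * batch interval.
      .set("spark.streaming.kafka.maxRatePerPartition", "150")

    val ssc = new StreamingContext(conf, Seconds(10))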

Re: Using newApiHadoopRDD for reading from HBase

2018-06-29 Thread Biplob Biswas
Can someone please help me out here, maybe point me to some documentation for this? I could hardly find anything. Thanks & Regards Biplob Biswas On Thu, Jun 28, 2018 at 11:13 AM Biplob Biswas wrote: > Hi, > > I had a few questions regarding the way *newApiHadoopRDD *accesse

Using newApiHadoopRDD for reading from HBase

2018-06-28 Thread Biplob Biswas
correct? 3. If it does load all the data from the scan operation, what happens when the data size is more than the executor memory? 4. What happens when we have a huge number of column qualifiers for a given row? Thanks & Regards Biplob Biswas
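
A minimal Scala sketch of the access pattern these questions are about, with a hypothetical table name and column family (needs the HBase client/mapreduce jars on the classpath). newAPIHadoopRDD with TableInputFormat creates roughly one Spark partition per HBase region, and each partition pulls rows through its region scanner as it is iterated rather than loading the whole scan result up front; a single very wide row is still returned as one Result object, though.

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("hbase-scan-sketch").master("local[*]").getOrCreate()
    val sc = spark.sparkContext

    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table")   // hypothetical table name
    hbaseConf.set(TableInputFormat.SCAN_COLUMN_FAMILY, "cf")  // hypothetical column family

    // Roughly one partition per region; rows are streamed from the region
    // servers as each partition is iterated, not materialized all at once.
    val hbaseRDD = sc.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])

    println(s"rows scanned: ${hbaseRDD.count()}")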

Spark Job ends when assignment not found for Kafka Partition

2018-05-17 Thread Biplob Biswas
util.ShutdownHookManager: Shutdown hook called Thanks & Regards Biplob Biswas

Re: Prefer Structured Streaming over Spark Streaming (DStreams)?

2018-02-02 Thread Biplob Biswas
Structured Streaming runs jobs on RDDs via dataframes and in the future, if the RDD abstraction needs to be switched, it will be done by replacing RDDs with something else. Please correct me if I understood this wrong. Thanks & Regards Biplob Biswas On Thu, Feb 1, 2018 at 12:12 AM, Michael Armb

Prefer Structured Streaming over Spark Streaming (DStreams)?

2018-01-31 Thread Biplob Biswas
Hi, I read an article which recommended using dataframes instead of RDD primitives. Now I read about the differences between using DStreams and Structured Streaming, and Structured Streaming adds a lot of improvements like checkpointing, windowing, sessioning, fault tolerance etc. What I am
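
For comparison with the DStream API, a minimal Structured Streaming sketch (hypothetical broker address and topic name, placeholder checkpoint path; needs the spark-sql-kafka package) showing the kind of built-in support the question mentions: a declarative source, a running aggregation on a DataFrame, and checkpointing handled by the engine.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("structured-streaming-sketch").master("local[*]").getOrCreate()

    // Declarative Kafka source; the engine tracks offsets and recovery itself.
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
      .option("subscribe", "events")                        // placeholder topic
      .load()
      .selectExpr("CAST(value AS STRING) AS value")

    // A running aggregation expressed on a DataFrame rather than on RDDs;
    // state management and checkpointing are handled by the engine.
    val counts = events.groupBy("value").count()

    val query = counts.writeStream
      .outputMode("update")
      .format("console")
      .option("checkpointLocation", "/tmp/checkpoints/sketch") // placeholder path
      .start()

    query.awaitTermination()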

Conflict resolution for data in spark streaming

2017-07-24 Thread Biplob Biswas
to fix this issue? I am not really sure if anyone has faced this kind of issue before, and whether someone has fixed anything like this. Thanks & Regards Biplob Biswas

Re: Spark Streaming - Increasing number of executors slows down processing rate

2017-06-20 Thread Biplob Biswas
this delay. Although I am not really sure, it feels like some issue with the Kafka-Spark integration, but I can't say for sure. Regards, Biplob Thanks & Regards Biplob Biswas On Tue, Jun 20, 2017 at 5:42 AM, Mal Edwin <mal.ed...@vinadionline.com> wrote: > Hi All, > > I am strugglin

[Spark Structured Streaming] Exception while using watermark with type of timestamp

2017-06-06 Thread Biplob Biswas
Hi, I am playing around with Spark Structured Streaming and we have a use case to use this as a CEP engine. I am reading from 3 different Kafka topics together. I want to perform windowing on this structured stream and then run some queries on this block on a sliding scale. Also, all of this
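
A minimal sketch of the windowing described above (hypothetical broker and topic names; needs the spark-sql-kafka package). The point relevant to the exception in the subject is that withWatermark only accepts a column of timestamp type, so an event time carried in the payload has to be cast to a timestamp before the watermark and window are applied; here the Kafka source's own timestamp column is used.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, window}

    val spark = SparkSession.builder().appName("watermark-window-sketch").master("local[*]").getOrCreate()

    // Subscribing to several topics at once; names and broker are placeholders.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "topicA,topicB,topicC")
      .load()

    // withWatermark requires a TimestampType column. The Kafka source's
    // `timestamp` column already has that type; an event time inside the
    // payload (e.g. epoch seconds) would need something like
    // col("epoch_secs").cast("timestamp") first.
    val events = stream.selectExpr("CAST(value AS STRING) AS payload", "timestamp AS event_time")

    // Sliding windows: 10-minute windows advancing every 5 minutes, tolerating
    // data up to 15 minutes behind the watermark.
    val windowed = events
      .withWatermark("event_time", "15 minutes")
      .groupBy(window(col("event_time"), "10 minutes", "5 minutes"))
      .count()

    val query = windowed.writeStream.outputMode("update").format("console").start()
    query.awaitTermination()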

Re: StreamingKmeans Spark doesn't work at all

2016-07-11 Thread Biplob Biswas
Hi Shuai, Thanks for the reply. I mentioned in the mail that I tried running the Scala example from the link I provided as well, and the result is the same. Thanks & Regards Biplob Biswas On Mon, Jul 11, 2016 at 5:52 AM, Shuai Lin <linshuai2...@gmail.com> wrote: > I would su

Re: StreamingKmeans Spark doesn't work at all

2016-07-10 Thread Biplob Biswas
grateful! Thanks a lot. Thanks & Regards Biplob Biswas On Thu, Jul 7, 2016 at 5:21 PM, Biplob Biswas <revolutioni...@gmail.com> wrote: > Hi, > > Can anyone care to please look into this issue? I would really love some > assistance here. > > Thanks a lot. > >

Re: StreamingKmeans Spark doesn't work at all

2016-07-07 Thread Biplob Biswas
Hi, Could anyone please look into this issue? I would really love some assistance here. Thanks a lot. Thanks & Regards Biplob Biswas On Tue, Jul 5, 2016 at 1:00 PM, Biplob Biswas <revolutioni...@gmail.com> wrote: > > Hi, > > I implemented the streamingK

StreamingKmeans Spark doesn't work at all

2016-07-05 Thread Biplob Biswas
no output in my Eclipse window ... just the Time! Can anyone seriously help me with this? Thank you so much Biplob Biswas

Re: Working of Streaming Kmeans

2016-07-03 Thread Biplob Biswas
Hi, Can anyone please explain this? Thanks & Regards Biplob Biswas On Sat, Jul 2, 2016 at 4:48 PM, Biplob Biswas <revolutioni...@gmail.com> wrote: > Hi, > > I wanted to ask a very basic question about the working of Streaming > Kmeans. > > Does the model up

Working of Streaming Kmeans

2016-07-02 Thread Biplob Biswas
Hi, I wanted to ask a very basic question about the working of Streaming KMeans. Does the model update only when training (i.e. when the training dataset is used), or does it update on the predictOnValues function as well for the test dataset? Thanks and Regards Biplob
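
A minimal Scala sketch of the MLlib API in question (placeholder input directories, one vector or labeled point per line): the cluster centers are updated only by trainOn over the training stream, while predictOnValues just assigns points to the current centers and does not change the model.

    import org.apache.spark.SparkConf
    import org.apache.spark.mllib.clustering.StreamingKMeans
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setMaster("local[2]").setAppName("streaming-kmeans-sketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder directories monitored for new files.
    val trainingData = ssc.textFileStream("/tmp/kmeans/train").map(Vectors.parse)
    val testData = ssc.textFileStream("/tmp/kmeans/test").map(LabeledPoint.parse)

    val model = new StreamingKMeans()
      .setK(3)
      .setDecayFactor(1.0)
      .setRandomCenters(2, 0.0) // 2-dimensional data, zero-weight initial centers

    // The model is updated here, on the training stream only ...
    model.trainOn(trainingData)
    // ... while predictOnValues only assigns points to the current centers.
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

    ssc.start()
    ssc.awaitTermination()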

Running JavaBased Implementation of StreamingKmeans Spark

2016-06-26 Thread Biplob Biswas
have already tried putting the data files in after starting the code but still get no output. I am also not getting any exception or anything, so it's hard to debug. I am very new to Spark and any help is highly appreciated. Thank you so much Biplob Biswas

Re: Running JavaBased Implementation of StreamingKmeans Spark

2016-06-25 Thread Biplob Biswas
Hi, I tried doing that but even then I couldn't see any results. I started the program and added the files later. Thanks & Regards Biplob Biswas On Sat, Jun 25, 2016 at 2:19 AM, Jayant Shekhar <jayantbaya...@gmail.com> wrote: > Hi Biplop, > > Can you try adding new files t

Running JavaBased Implementation of StreamingKmeans Spark

2016-06-24 Thread Biplob Biswas
and "samplegpsdata_test.txt" with training data having 500 datapoints and test data with 60 datapoints. I am very new to Spark and any help is highly appreciated. Thank you so much Biplob Biswas

Running JavaBased Implementation of StreamingKmeans Spark

2016-06-22 Thread Biplob Biswas
and "samplegpsdata_test.txt" with training data having 500 datapoints and test data with 60 datapoints. I am very new to Spark and any help is highly appreciated. Thank you so much Biplob Biswas

Re: Running JavaBased Implementation of StreamingKmeans

2016-06-21 Thread Biplob Biswas
Hi, Can someone please look into this and tell me what's wrong? And why am I not getting any output? Thanks & Regards Biplob Biswas On Sun, Jun 19, 2016 at 1:29 PM, Biplob Biswas <revolutioni...@gmail.com> wrote: > Hi, > > Thanks for that input, I tried doing that b

Re: Running JavaBased Implementation of StreamingKmeans

2016-06-19 Thread Biplob Biswas
understand the syntax of Scala very well, so I wrote my own implementation of streaming k-means in Java, and I am hoping that's correct. Thanks & Regards Biplob Biswas On Sun, Jun 19, 2016 at 3:23 AM, Akhil Das <ak...@hacked.work> wrote: > SparkStreaming does not pick up old files by default
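
As quoted above, the file stream only picks up files that appear in the monitored directory after the streaming context has started. A minimal sketch (placeholder directory) for confirming that input is actually being read: print a per-batch record count, then copy or atomically move new files into the directory while the job is running.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setMaster("local[2]").setAppName("file-stream-check")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Only files created (moved/renamed in) after ssc.start() are processed;
    // files already sitting in the directory are ignored by default.
    val lines = ssc.textFileStream("/tmp/streaming-input")

    // A per-batch count makes it obvious whether anything is being read;
    // with no new files you will only see the batch "Time:" header.
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()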

Re: Running JavaBased Implementation of StreamingKmeans

2016-06-18 Thread Biplob Biswas
Hi, I tried local[*] and local[2] and the result is the same. I don't really understand the problem here. How can I confirm that the files are read properly? Thanks & Regards Biplob Biswas On Sat, Jun 18, 2016 at 5:59 PM, Akhil Das <ak...@hacked.work> wrote: > Looks like you nee

Running JavaBased Implementation of StreamingKmeans

2016-06-18 Thread Biplob Biswas
psdata_test.txt" with training data having 500 datapoints and test data with 60 datapoints. I am very new to Spark and any help is highly appreciated. Thanks & Regards Biplob Biswas

Running JavaBased Implementation of StreamingKmeans

2016-06-17 Thread Biplob Biswas
and "samplegpsdata_test.txt" with training data having 500 datapoints and test data with 60 datapoints. I am very new to Spark and any help is highly appreciated. Thank you so much Biplob Biswas

Running Java Implementation of StreamingKmeans

2016-06-17 Thread Biplob Biswas
and "samplegpsdata_test.txt" with training data having 500 datapoints and test data with 60 datapoints. I am very new to Spark and any help is highly appreciated. Thank you so much Biplob Biswas

Running Java Implementation of StreamingKmeans

2016-06-17 Thread Biplob Biswas
psdata_test.txt" with training data having 500 datapoints and test data with 60 datapoints. I am very new to Spark and any help is highly appreciated. Thank you so much Biplob Biswas