will end
up having a lot of scheduling delay.
Maybe check why it takes 1 minute to process 100 records, and fix that logic.
Also, I see that batches with a higher number of events sometimes take a lower
amount of processing time. Fix the code logic and this should be resolved.
Thanks & Regards
Biplob Biswas
Hi Yuqi,
Just curious, can you share the spark-submit script, and what are you passing
as the --master argument?
Thanks & Regards
Biplob Biswas
On Wed, Oct 31, 2018 at 10:34 AM Gourav Sengupta
wrote:
> Just out of curiosity why would you not use Glue (which is Spark on
> kubernet
You need to send the email to user-unsubscr...@spark.apache.org and not to
the usergroup.
Thanks & Regards
Biplob Biswas
On Tue, Oct 30, 2018 at 10:59 AM Anu B Nair wrote:
> I am sending this Unsubscribe mail for the last few months! It never happens!
> If anyone can help us to unsubscr
(build_edges, 25)
Although, based on the return type, you would have to modify your build_edges
function.
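For later readers, since the original snippet is truncated above: a minimal
spark-shell sketch of the general pattern being suggested, replacing
groupByKey-plus-aggregation with reduceByKey. The names and data below are
placeholders, not the original code.

    // placeholder pairs standing in for the real (key, value) RDD
    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))

    // groupByKey then aggregate: ships every value across the shuffle first
    val viaGroup = pairs.groupByKey().mapValues(_.sum)

    // reduceByKey: combines values map-side before the shuffle, same result
    val viaReduce = pairs.reduceByKey(_ + _)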
Thanks & Regards
Biplob Biswas
On Mon, Aug 6, 2018 at 6:28 PM Bathi CCDB wrote:
> Hey Bipin,
> Thanks for the reply, I am actually aggregating after the groupByKey()
> operation,
Hi Santhosh,
If you are not performing any aggregation, then I don't think you can
replace your groupByKey with a reduceByKey. As I see it, you are only
grouping and taking 2 values of the result, so I believe you can't just
replace your groupByKey like that.
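That said, for anyone wanting the first couple of values per key without a
full groupByKey, one commonly suggested alternative is aggregateByKey with a
bounded buffer. A spark-shell sketch with placeholder data (note that which
2 values you keep is not deterministic, just as with groupByKey followed by
take):

    val pairs = sc.parallelize(Seq(("k1", 1), ("k1", 2), ("k1", 3), ("k2", 4)))

    // keep at most 2 values per key map-side, so far less data is shuffled
    val firstTwo = pairs.aggregateByKey(List.empty[Int])(
      (buf, v) => if (buf.size < 2) v :: buf else buf,
      (b1, b2) => (b1 ++ b2).take(2)
    )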
Thanks & Regards
Biplob Biswas
Hi Todd,
Thanks a lot, that works. Although I am curious whether you know why the
initialRate setting is not kicking in?
But for now the pipeline is usable again. Thanks a lot.
Thanks & Regards
Biplob Biswas
On Thu, Jul 26, 2018 at 3:03 PM Todd Nist wrote:
> Have you tried r
--class "${MAIN_CLASS}" \
"${ARTIFACT_FILE}"
The first batch is huge; even if it had worked for the first batch, I would
have tried researching further. The problem is that the first batch is more
than 500k records.
Thanks & Regards
Biplob Biswas
On Thu, Jul 26, 2018 a
Did anyone face a similar issue? And is there any viable way to solve this?
Thanks & Regards
Biplob Biswas
On Wed, Jul 25, 2018 at 4:23 PM Biplob Biswas
wrote:
> I have enabled the spark.streaming.backpressure.enabled setting and also
> set spark.streaming.backpressure.initialRate to 150
and
they are all taken in one huge batch, which ultimately takes a long time and
fails with an executor failure exception. We don't have more resources to give
in our test cluster, and we expect the backpressure to kick in and take
smaller batches.
What could I be doing wrong?
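For later readers: Todd's reply is cut off above, but a plausible explanation
is that spark.streaming.backpressure.initialRate only applies to
receiver-based streams, while the direct Kafka DStream caps its batches
(including the first one) via spark.streaming.kafka.maxRatePerPartition. A
configuration sketch with placeholder values:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setAppName("backpressure-sketch")
      .set("spark.streaming.backpressure.enabled", "true")
      // bounds records per partition per second for the direct Kafka
      // stream, which also caps the very first batch
      .set("spark.streaming.kafka.maxRatePerPartition", "50")

    val ssc = new StreamingContext(conf, Seconds(10))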
Thanks & Regards
Biplob Biswas
Can someone please help me out here, or maybe point me to some documentation
for this? I could find almost nothing.
Thanks & Regards
Biplob Biswas
On Thu, Jun 28, 2018 at 11:13 AM Biplob Biswas
wrote:
> Hi,
>
> I had a few questions regarding the way *newAPIHadoopRDD* accesse
correct?
3. If it does load all the data from the scan operation, what happens when
the data size is more than executor memory?
4. What happens when we have a huge number of column qualifiers for a given
row?
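For reference, the usual wiring for this (the table name below is a
placeholder). As far as I know, newAPIHadoopRDD creates one partition per
region split and streams rows through the scanner lazily rather than loading
the whole scan at once, so memory pressure depends on what the job does with
the rows afterwards (caching, or very wide rows where a single Result holds
many qualifiers).

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat

    // spark-shell sketch; sc is the SparkContext
    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table") // placeholder

    val hbaseRdd = sc.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])

    println(hbaseRdd.count())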
Thanks & Regards
Biplob Biswas
util.ShutdownHookManager: Shutdown hook called
Thanks & Regards
Biplob Biswas
eaming runs jobs on RDDs via DataFrames, and in the future, if the RDD
abstraction needs to be switched out, it will be done by replacing RDDs with
something else. Please correct me if I understood this wrong.
Thanks & Regards
Biplob Biswas
On Thu, Feb 1, 2018 at 12:12 AM, Michael Armb
Hi,
I read an article which recommended using DataFrames instead of RDD
primitives. Now I have read about the differences between using DStreams and
Structured Streaming, and Structured Streaming adds a lot of improvements
like checkpointing, windowing, sessionization, fault tolerance, etc.
What I am
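The question is truncated here, but for later readers, a minimal Structured
Streaming sketch of the DataFrame-based API being contrasted with DStreams
(broker, topic, and checkpoint path are placeholders):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("sss-sketch").getOrCreate()

    // an unbounded DataFrame over a Kafka topic
    val df = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "events")
      .load()

    // checkpointing is what gives the fault-tolerance guarantees mentioned
    val query = df.selectExpr("CAST(value AS STRING) AS value")
      .writeStream
      .format("console")
      .option("checkpointLocation", "/tmp/sss-checkpoint")
      .start()

    query.awaitTermination()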
to fix this issue? I am not really sure if anyone has faced
this kind of issue before, or whether someone has fixed anything like this.
Thanks & Regards
Biplob Biswas
this delay. Although I am not really sure, it feels like some issue with the
Kafka-Spark integration, but I can't say for sure.
Regards,
Biplob
Thanks & Regards
Biplob Biswas
On Tue, Jun 20, 2017 at 5:42 AM, Mal Edwin <mal.ed...@vinadionline.com>
wrote:
> Hi All,
>
> I am strugglin
Hi,
I am playing around with Spark Structured Streaming, and we have a use case
for using it as a CEP engine.
I am reading from 3 different Kafka topics together. I want to perform
windowing on this structured stream and then run some queries on this block
on a sliding scale. Also, all of this
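The post is truncated, but for the windowing part of the question, a hedged
sketch of a sliding event-time window over several topics at once (topic
names and durations are placeholders; this builds on the same readStream
setup as the sketch further above):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, window}

    val spark = SparkSession.builder().appName("cep-sketch").getOrCreate()

    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "topicA,topicB,topicC") // several topics at once
      .load()

    // 10-minute windows sliding every 5 minutes, keyed on the Kafka
    // record timestamp the source exposes as a `timestamp` column
    val counts = events
      .groupBy(window(col("timestamp"), "10 minutes", "5 minutes"))
      .count()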
Hi Shuai,
Thanks for the reply. I mentioned in the mail that I also tried running the
Scala example from the link I provided, and the result is the same.
Thanks & Regards
Biplob Biswas
On Mon, Jul 11, 2016 at 5:52 AM, Shuai Lin <linshuai2...@gmail.com> wrote:
> I would su
grateful! Thanks a lot.
Thanks & Regards
Biplob Biswas
On Thu, Jul 7, 2016 at 5:21 PM, Biplob Biswas <revolutioni...@gmail.com>
wrote:
> Hi,
>
> Would anyone care to look into this issue? I would really love some
> assistance here.
>
> Thanks a lot.
>
>
Hi,
Would anyone care to look into this issue? I would really love some
assistance here.
Thanks a lot.
Thanks & Regards
Biplob Biswas
On Tue, Jul 5, 2016 at 1:00 PM, Biplob Biswas <revolutioni...@gmail.com>
wrote:
>
> Hi,
>
> I implemented the streamingK
no output in my Eclipse window ... just the Time!
Can anyone seriously help me with this?
Thank you so much
Biplob Biswas
Hi,
Can anyone please explain this?
Thanks & Regards
Biplob Biswas
On Sat, Jul 2, 2016 at 4:48 PM, Biplob Biswas <revolutioni...@gmail.com>
wrote:
> Hi,
>
> I wanted to ask a very basic question about the working of Streaming
> Kmeans.
>
> Does the model up
Hi,
I wanted to ask a very basic question about the working of StreamingKMeans.
Does the model update only when training (i.e., when the training dataset is
used), or does it update on the predictOnValues function as well for the test
dataset?
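As far as I can tell from the MLlib API, only trainOn updates the cluster
centers; predictOnValues just assigns points to the current centers without
updating them. A sketch mirroring the documented usage (paths, k, and
dimensions are placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.mllib.clustering.StreamingKMeans
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("skm-sketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))

    val trainingData = ssc.textFileStream("/data/train").map(Vectors.parse)
    val testData = ssc.textFileStream("/data/test").map(LabeledPoint.parse)

    val model = new StreamingKMeans()
      .setK(3)
      .setDecayFactor(1.0)
      .setRandomCenters(2, 0.0) // 2-dimensional points, zero initial weight

    model.trainOn(trainingData) // this is the call that updates the model
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

    ssc.start()
    ssc.awaitTermination()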
Thanks and Regards
Biplob
ave already tried putting in the data files after starting the code, but
still no output. I am also not getting any exception or anything, so it's
hard to debug for me.
I am very new to Spark, and any help is highly appreciated.
Thank you so much
Biplob Biswas
Hi,
I tried doing that but even then I couldn't see any results. I started the
program and added the files later.
Thanks & Regards
Biplob Biswas
On Sat, Jun 25, 2016 at 2:19 AM, Jayant Shekhar <jayantbaya...@gmail.com>
wrote:
> Hi Biplop,
>
> Can you try adding new files t
and "samplegpsdata_test.txt" with training data having 500 datapoints and
test data with 60 datapoints.
I am very new to Spark, and any help is highly appreciated.
Thank you so much
Biplob Biswas
Hi,
Can someone please look into this and tell me what's wrong? And why am I not
getting any output?
Thanks & Regards
Biplob Biswas
On Sun, Jun 19, 2016 at 1:29 PM, Biplob Biswas <revolutioni...@gmail.com>
wrote:
> Hi,
>
> Thanks for that input, I tried doing that b
understand the syntax
of Scala very well, thus I wrote my own implementation of streaming k-means
in Java, so I am hoping that's correct.
Thanks & Regards
Biplob Biswas
On Sun, Jun 19, 2016 at 3:23 AM, Akhil Das <ak...@hacked.work> wrote:
> SparkStreaming does not pick up old files by default
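For anyone debugging the same thing: textFileStream only picks up files whose
modification time falls after the stream starts, and the usual advice is to
write files elsewhere first and then atomically move them into the watched
directory. A minimal sketch (the path is a placeholder):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("file-stream-sketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))

    // only files moved into this directory after ssc.start() are processed
    ssc.textFileStream("/data/incoming").print()

    ssc.start()
    ssc.awaitTermination()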
Hi,
I tried local[*] and local[2] and the result is the same. I don't really
understand the problem here.
How can I confirm that the files are read properly?
Thanks & Regards
Biplob Biswas
On Sat, Jun 18, 2016 at 5:59 PM, Akhil Das <ak...@hacked.work> wrote:
> Looks like you nee
psdata_test.txt" with training data having 500 datapoints and
test data with 60 datapoints.
I am very new to Spark, and any help is highly appreciated.
Thanks & Regards
Biplob Biswas