Re: What is the best way for Spark to read HDF5@scale?

2018-09-17 Thread Saurav Sinha
…connector for HDF5? > > The following link does not work anymore: > > https://www.hdfgroup.org/downloads/spark-connector/ > > Thanks, > > Kathleen > -- Thanks and Regards, Saurav Sinha Contact: 9742879062

Setting spark.yarn.stagingDir in 1.6

2017-03-15 Thread Saurav Sinha
in spark 1.6. -- Thanks and Regards, Saurav Sinha Contact: 9742879062

Re: Help in generating unique Id in spark row

2016-10-17 Thread Saurav Sinha
Can anyone help me out? On Mon, Oct 17, 2016 at 7:27 PM, Saurav Sinha <sauravsinh...@gmail.com> wrote: > Hi, > > I am in a situation where I want to generate a unique Id for each row. > > I have used monotonicallyIncreasingId but it is giving increasing values > and sta

Help in generating unique Id in spark row

2016-10-17 Thread Saurav Sinha
null| |null|2439d6db-16a2-44b...| +----+--------+ -- Thanks and Regards, Saurav Sinha Contact: 9742879062
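The two threads above ask how to give every row a unique Id when `monotonicallyIncreasingId` produces values that are unique but not consecutive. A minimal plain-Python sketch of the semantics involved (Spark itself is not required to see the idea; the `mono_id` helper is ours and only illustrates the bit layout the Spark API docs describe — partition id in the upper bits, per-partition record number in the lower 33 bits):

```python
import uuid

rows = ["row_a", "row_b", "row_c"]

# Option 1 (gap-free, consecutive ids): the semantics of RDD.zipWithIndex —
# each row is paired with a 0-based index.
with_index = list(enumerate(rows))  # [(0, 'row_a'), (1, 'row_b'), (2, 'row_c')]

# Option 2 (globally unique, not consecutive): a UUID per row, mirroring
# the uuid column visible in the sample output quoted in the thread.
with_uuid = [(str(uuid.uuid4()), r) for r in rows]

# Why monotonically_increasing_id "jumps": per the Spark docs it packs the
# partition id above the lower 33 bits, so ids from different partitions
# are far apart. An illustration of that layout, not Spark's exact code:
def mono_id(partition_id, record_number):
    return (partition_id << 33) + record_number
```

`zipWithIndex` costs an extra pass over the data to count partition sizes, which is why the non-consecutive built-in is often preferred when gaps are acceptable.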

Re: Finding unique across all columns in dataset

2016-09-19 Thread Saurav Sinha
the columns of rdd and write it to hdfs. > > How can I achieve this? > > Is there any distributed data structure that I can use and keep on > updating it as I traverse the new rows? > > Regards, > Abhi > -- Thanks and Regards, Saurav Sinha Contact: 9742879062
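The question above — collecting the distinct values of every column while traversing rows once — maps naturally onto an aggregate with a per-row fold and a merge of partial results. A plain-Python sketch of those two functions (names and sample data are ours; in Spark these would be the `seqOp`/`combOp` arguments of `rdd.aggregate`):

```python
from functools import reduce

# Rows of an RDD-like dataset; goal: distinct values per column.
rows = [(1, "a", "x"), (2, "a", "y"), (1, "b", "x")]
n_cols = len(rows[0])

def seq_op(acc, row):
    # fold one row into the per-column sets (Spark: seqOp)
    for i, v in enumerate(row):
        acc[i].add(v)
    return acc

def comb_op(a, b):
    # merge partial results from two partitions (Spark: combOp)
    return [x | y for x, y in zip(a, b)]

uniques = reduce(seq_op, rows, [set() for _ in range(n_cols)])
# column 0 -> {1, 2}, column 1 -> {'a', 'b'}, column 2 -> {'x', 'y'}
```

Because sets merge associatively, the same pair of functions works unchanged whether the rows live in one partition or many.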

Explanation regarding Spark Streaming

2016-08-04 Thread Saurav Sinha
Hi, I have a query. Q1. What will happen if a Spark Streaming job has a batch duration of 60 sec and the processing time of the complete pipeline is greater than 60 sec? -- Thanks and Regards, Saurav Sinha Contact: 9742879062
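The short answer to the question above: batches keep arriving every 60 s regardless, so when processing takes longer than the interval, batches queue up and the scheduling delay grows without bound until the job is throttled or falls over. A toy simulation of that accumulation (function and numbers are ours, purely illustrative):

```python
# Scheduling delay when each batch can only start after the previous
# one finishes; times in seconds.
def scheduling_delays(batch_interval, processing_time, n_batches):
    delays, delay = [], 0.0
    for _ in range(n_batches):
        delays.append(delay)
        # next batch waits out the leftover processing time
        delay = max(0.0, delay + processing_time - batch_interval)
    return delays

stable = scheduling_delays(60, 45, 5)          # stays at 0 every batch
falling_behind = scheduling_delays(60, 75, 5)  # grows 15 s per batch
```

This is why the usual guidance is to keep total processing time under the batch interval, or (in later Spark versions) enable backpressure so the ingest rate adapts.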

Re: Spark driver getting out of memory

2016-07-20 Thread Saurav Sinha
utilization. Thanks, Saurav Sinha On Tue, Jul 19, 2016 at 10:14 PM, RK Aduri <rkad...@collectivei.com> wrote: > Just want to see if this helps. > > Are you doing heavy collects and persist that? If that is so, you might > want to parallelize that collection by converting to an RDD

Re: Spark driver getting out of memory

2016-07-19 Thread Saurav Sinha
. Thanks, Saurav Sinha On Tue, Jul 19, 2016 at 2:42 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote: > can you please clarify: > > >1. In what mode are you running the spark standalone, yarn-client, >yarn cluster etc >2. You have 4 nodes with each execu

Re: Spark driver getting out of memory

2016-07-18 Thread Saurav Sinha
, x would be as large as can be set. > > > On Monday, July 18, 2016 6:31 PM, Saurav Sinha <sauravsinh...@gmail.com> > wrote: > > > Hi, > > I am running a spark job. > > Master memory - 5G > executor memory 10G (running on 4 nodes) > > My job is getting kill

Spark driver getting out of memory

2016-07-18 Thread Saurav Sinha
pper.flush(CompressionCodec.scala:197) at java.io.ObjectOutputStream$BlockDataOutputStream.flush(ObjectOutputStream.java:1822) Help needed. -- Thanks and Regards, Saurav Sinha Contact: 9742879062
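For driver out-of-memory errors like the one in this thread, the first knobs usually suggested are the driver heap and the cap on results pulled back by `collect`-style actions. A hedged sketch of a submit line (the flag names are real Spark settings; the sizes, master URL, class, and jar names are placeholders, not from the thread):

```shell
# Illustrative only: raise driver heap, bound collected result size.
# spark.driver.maxResultSize defaults to 1g; 0 disables the limit.
spark-submit \
  --master spark://master:7077 \
  --driver-memory 8g \
  --conf spark.driver.maxResultSize=2g \
  --class com.example.MyJob \
  my-job.jar
```

If the job still dies, the more durable fix is to avoid collecting large datasets to the driver at all and keep the work distributed (e.g. write out from the executors).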

Error in Spark job

2016-07-12 Thread Saurav Sinha
) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1426) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) -- Thanks and Regards, Saurav Sinha Contact: 9742879062

Re: EOFException while reading from HDFS

2016-04-28 Thread Saurav Sinha
and in >> > addition to the three lines above, added the following line to >> > conf/spark-env.sh >> > >> > >> > export SPARK_DIST_CLASSPATH="/usr/local/hadoop-1.0.4/bin/hadoop" >> > >> > >> > but none of it seems to work. However, the following command works from >> > 172.26.49.55 and gives the directory listing: >> > >> > /usr/local/hadoop-1.0.4/bin/hadoop fs -ls hdfs://172.26.49.156:54310/ >> > >> > >> > Any suggestion? >> > >> > >> > Thanks >> > >> > Bibudh >> > >> > >> > -- >> > Bibudh Lahiri >> > Data Scientist, Impetus Technologies >> > 5300 Stevens Creek Blvd >> > San Jose, CA 95129 >> > http://knowthynumbers.blogspot.com/ >> > >> >> - >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >> For additional commands, e-mail: user-h...@spark.apache.org >> >> > > > -- > Bibudh Lahiri > Senior Data Scientist, Impetus Technologies > 720 University Avenue, Suite 130 > Los Gatos, CA 95129 > http://knowthynumbers.blogspot.com/ > > -- Thanks and Regards, Saurav Sinha Contact: 9742879062

Spark job is running infinitely

2015-10-12 Thread Saurav Sinha
and Regards, Saurav Sinha Contact: 9742879062

Re: Spark job is running infinitely

2015-10-12 Thread Saurav Sinha
> Thanks > > On Mon, Oct 12, 2015 at 10:07 AM, Saurav Sinha <sauravsinh...@gmail.com> > wrote: > >> Hi Experts, >> >> I am facing issue in which spark job is running infinitely. >> >> When I start spark job on 4 node cluster. >> >> In

Re: Spark job is running infinitely

2015-10-12 Thread Saurav Sinha
Hi Ted, Which monitoring service would you suggest? Thanks, Saurav On Mon, Oct 12, 2015 at 11:55 PM, Saurav Sinha <sauravsinh...@gmail.com> wrote: > Hi Ted, > > Which would you suggest for monitoring service for me. > > Thanks, > Saurav > > On Mon, Oc

Re: Spark job is running infinitely

2015-10-12 Thread Saurav Sinha
> For the second part, Spark experts may have answer for you. > > On Mon, Oct 12, 2015 at 11:09 AM, Saurav Sinha <sauravsinh...@gmail.com> > wrote: > >> Hi Ted, >> >> *Do you have monitoring put in place to detect 'no space left' scenario ?* >> >> No

Master getting down with Memory issue.

2015-09-28 Thread Saurav Sinha
…more than 5 min to respond with the status of jobs. Running spark 1.4.1 in standalone mode on a 5-machine cluster. Kindly suggest a solution for the memory issue; it is a blocker. Thanks, Saurav Sinha -- Thanks and Regards, Saurav Sinha Contact: 9742879062

Re: Master getting down with Memory issue.

2015-09-28 Thread Saurav Sinha
Hi Akhil, Can you please explain to me how increasing the number of partitions (which is a worker-node setting) will help, as the issue is that my master is getting OOM. Thanks, Saurav Sinha On Mon, Sep 28, 2015 at 2:32 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote: > This behavior totall
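The poster's instinct here is right: partition counts affect executors, not the standalone Master. A commonly cited cause of Master heap growth is the UI state it retains for finished applications and drivers. A sketch of the relevant knobs (the setting and variable names are real Spark standalone-mode configuration; the sizes are illustrative, not from the thread):

```shell
# conf/spark-env.sh on the master node: heap for the daemon itself
export SPARK_DAEMON_MEMORY=2g

# conf/spark-defaults.conf — the standalone Master keeps UI state for
# completed apps/drivers; lowering these bounds that retained state:
#   spark.deploy.retainedApplications 50   (default 200)
#   spark.deploy.retainedDrivers      50   (default 200)
```

With long-running clusters that churn through many applications, bounding the retained history is usually what stops the slow Master heap creep.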

Re: Unreachable dead objects permanently retained on heap

2015-09-25 Thread Saurav Sinha
4.1 in standalone mode on a 5-machine cluster. Kindly suggest a solution for the memory issue; it is a blocker. Thanks, Saurav Sinha On Fri, Sep 25, 2015 at 5:01 PM, James Aley <james.a...@swiftkey.com> wrote: > Hi, > > We have an application that submits several thousand jobs within the s

Fwd: Issue with high no of skipped task

2015-09-21 Thread Saurav Sinha
Hi Users, I am new to Spark and have written a flow. When we deployed our code it was completing jobs in 4-5 min, but now it is taking 20+ min with almost the same set of data. Can you please help me figure out the reason? -- Thanks and Regards, Saurav Sinha Contact: 9742879062

Issue with high no of skipped task

2015-09-21 Thread Saurav Sinha
Hi Users, I am new to Spark and have written a flow. When we deployed our code it was completing jobs in 4-5 min, but now it is taking 20+ min with almost the same set of data. Can you please help me figure out the reason? -- Thanks and Regards, Saurav Sinha Contact: 9742879062

Fwd: Issue with high no of skipped task

2015-09-21 Thread Saurav Sinha
-- Forwarded message -- From: "Saurav Sinha" <sauravsinh...@gmail.com> Date: 21-Sep-2015 11:48 am Subject: Issue with high no of skipped task To: <user@spark.apache.org> Cc: Hi Users, I am new to Spark and have written a flow. When we deployed our code it is comp