Hi Arkadiusz,

Do you have any suggestions?
As an engineer, when I get disk-full errors I want the application to terminate. It's a lot easier for ops to realize there is a problem. I have put a few rough sketches of what I have in mind below the quoted thread.

Andy

From: Arkadiusz Bicz <arkadiusz.b...@gmail.com>
Date: Friday, February 12, 2016 at 1:57 AM
To: Andrew Davidson <a...@santacruzintegration.com>
Cc: "user @spark" <user@spark.apache.org>
Subject: Re: best practices? spark streaming writing output detecting disk full error

> Hi,
>
> You need good monitoring tools to send you alarms about disk, network,
> or application errors, but I think that is general dev-ops work, not
> very specific to Spark or Hadoop.
>
> BR,
>
> Arkadiusz Bicz
> https://www.linkedin.com/in/arkadiuszbicz
>
> On Thu, Feb 11, 2016 at 7:09 PM, Andy Davidson
> <a...@santacruzintegration.com> wrote:
>> We recently started a Spark/Spark Streaming POC. We wrote a simple
>> streaming app in Java to collect tweets. We chose Twitter because we knew
>> we would get a lot of data and probably lots of bursts, which is good for
>> stress testing.
>>
>> We spun up a couple of small clusters using the spark-ec2 script. In one
>> cluster we wrote all the tweets to HDFS; in a second cluster we wrote all
>> the tweets to S3.
>>
>> We were surprised that our HDFS file system reached 100% of capacity in a
>> few days. This resulted in "all data nodes dead". We were surprised
>> because the streaming app itself continued to run. We had no idea we had
>> a problem until a day or two after the disk became full, when we noticed
>> we were missing a lot of data.
>>
>> We ran into a similar problem with our S3 cluster. We had a permission
>> problem and were unable to write any data, yet our streaming app
>> continued to run.
>>
>> Spark generated mountains of logs. We are using the standalone cluster
>> manager, and all the log levels wind up in the "error" log, making it
>> hard to find real errors and warnings using the web UI. Our app is
>> written in Java, so my guess is the write errors must be unchecked
>> exceptions; i.e., we did not know in advance that they could occur. They
>> are basically undocumented.
>>
>> We are a small shop. Running something like Splunk would add a lot of
>> expense and complexity for us at this stage of our growth.
>>
>> What are best practices?
>>
>> Kind Regards
>>
>> Andy
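Here is the kind of fail-fast wrapper I mean. This is a rough, untested sketch against the Spark 1.6 Java streaming API with Java 8 lambdas; the DStream name and the output path are placeholders for whatever your job really uses:

import org.apache.spark.streaming.api.java.JavaDStream;

public final class FailFastOutput {

    // Write each batch to HDFS and kill the driver if a write fails.
    // The path prefix "hdfs:///tweets/" is a made-up example.
    public static void saveOrDie(JavaDStream<String> tweets) {
        tweets.foreachRDD((rdd, time) -> {
            try {
                if (!rdd.isEmpty()) {
                    rdd.saveAsTextFile("hdfs:///tweets/" + time.milliseconds());
                }
            } catch (Exception e) {
                // Disk-full and permission failures show up here as runtime
                // exceptions thrown by the failed Spark job. Log loudly and
                // exit; a dead process is something ops will actually notice.
                System.err.println("FATAL: could not write batch " + time + ": " + e);
                System.exit(1);
            }
        });
    }
}

System.exit() is blunt, but that is the point: the process stops, monitoring sees it, and nobody silently loses days of tweets.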
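For the S3 permission problem, a cheap guard is a canary write at startup, so a misconfigured bucket kills the app immediately instead of days later. Another untested sketch; the bucket name and key are made up, and I am assuming the s3n:// filesystem that spark-ec2 clusters typically use:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class S3WriteCanary {

    // Prove we can write to the output bucket before starting the stream.
    // Any failure (bad credentials, missing ACL) throws and aborts startup.
    public static void check(Configuration hadoopConf) throws Exception {
        Path canary = new Path("s3n://my-tweet-bucket/canary/_WRITE_TEST");
        FileSystem fs = canary.getFileSystem(hadoopConf);
        try (FSDataOutputStream out = fs.create(canary, true)) {
            out.writeBytes("ok");
        }
        fs.delete(canary, false); // clean up after a successful test write
    }
}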
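For the disk-full case specifically, the driver can also poll HDFS capacity itself and bail out before the data nodes do. As far as I can tell, FileSystem.getStatus() reports cluster-wide capacity and usage when the underlying filesystem is HDFS; the threshold here is an arbitrary example value:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public final class HdfsCapacityWatchdog {

    // Exit if HDFS is more than maxUsedFraction full (e.g. 0.90).
    public static void exitIfNearlyFull(Configuration conf, double maxUsedFraction)
            throws IOException {
        FsStatus status = FileSystem.get(conf).getStatus();
        double used = (double) status.getUsed() / (double) status.getCapacity();
        if (used >= maxUsedFraction) {
            System.err.printf("FATAL: HDFS is %.0f%% full, terminating%n", used * 100);
            System.exit(1);
        }
    }
}

Something like this could run once per batch inside the foreachRDD above, or once a minute from a ScheduledExecutorService.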
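On the "mountains of logs" point, we probably do not need Splunk just to make real warnings visible. Raising the log level for Spark's own packages, either programmatically or in conf/log4j.properties, cuts most of the noise. A sketch against the log4j 1.x API that the Spark 1.x distribution ships with; the application package name is a placeholder:

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public final class QuietSparkLogs {

    public static void configure() {
        // Silence Spark's and Hadoop's chatty INFO output so that real
        // warnings and errors stand out in the web UI logs.
        Logger.getLogger("org.apache.spark").setLevel(Level.WARN);
        Logger.getLogger("org.apache.hadoop").setLevel(Level.WARN);
        // Keep our own application's logger verbose.
        Logger.getLogger("com.example.tweets").setLevel(Level.INFO);
    }
}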