Re: Spark Writing to parquet directory : java.io.IOException: Disk quota exceeded
The error message seems self-explanatory: try to find out what disk quota you have for your user.

On Wed, Nov 22, 2017 at 8:23 AM, Chetan Khatri wrote:
> Any reply on this?
>
> On Tue, Nov 21, 2017 at 3:36 PM, Chetan Khatri <chetan.opensou...@gmail.com> wrote:
>> [original message quoted below in the thread]
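For what it's worth, the failing path in the trace is a temp_shuffle_* file under the YARN NodeManager's local dirs, so the quota that matters is the one on the volume backing those dirs (here a MapR mount), not on the parquet output location. Below is a minimal sketch of redirecting Spark's scratch space to a volume with enough quota, assuming a hypothetical /data/scratch/spark directory; note that on YARN the NodeManager's yarn.nodemanager.local-dirs takes precedence over this setting, so there the fix has to happen in the cluster config instead:

    import org.apache.spark.sql.SparkSession

    // Sketch only: shuffle and spill files (blockmgr-*/temp_shuffle_*) are
    // written under spark.local.dir, so point it at a volume whose quota can
    // hold the job's shuffle output. The path below is hypothetical.
    val spark = SparkSession.builder()
      .appName("parquet-write")
      .config("spark.local.dir", "/data/scratch/spark")
      .getOrCreate()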
Re: Spark Writing to parquet directory : java.io.IOException: Disk quota exceeded
Any reply on this?

On Tue, Nov 21, 2017 at 3:36 PM, Chetan Khatri wrote:
> [original message quoted below in the thread]
Spark Writing to parquet directory : java.io.IOException: Disk quota exceeded
Hello Spark Users,

I am getting the error below when I try to write a dataset to a parquet location. I have enough disk space available. The last time I faced this kind of error, it was resolved by increasing the number of cores in the hyper parameters. The current result set is almost 400 GB, with the following hyper parameters:

Driver memory: 4g
Executor memory: 16g
Executor cores: 12
Num executors: 8

It is still failing. Any idea whether increasing executor memory and the number of executors could resolve it?

17/11/21 04:29:37 ERROR storage.DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /mapr/chetan/local/david.com/tmp/hadoop/nm-local-dir/usercache/david-khurana/appcache/application_1509639363072_10572/blockmgr-008604e6-37cb-421f-8cc5-e94db75684e7/12/temp_shuffle_ae885911-a1ef-404f-9a6a-ded544bb5b3c
java.io.IOException: Disk quota exceeded
        at java.io.FileOutputStream.close0(Native Method)
        at java.io.FileOutputStream.access$000(FileOutputStream.java:53)
        at java.io.FileOutputStream$1.close(FileOutputStream.java:356)
        at java.io.FileDescriptor.closeAll(FileDescriptor.java:212)
        at java.io.FileOutputStream.close(FileOutputStream.java:354)
        at org.apache.spark.storage.TimeTrackingOutputStream.close(TimeTrackingOutputStream.java:72)
        at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
        at net.jpountz.lz4.LZ4BlockOutputStream.close(LZ4BlockOutputStream.java:178)
        at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
        at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
        at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$2.close(UnsafeRowSerializer.scala:96)
        at org.apache.spark.storage.DiskBlockObjectWriter$$anonfun$close$2.apply$mcV$sp(DiskBlockObjectWriter.scala:108)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1316)
        at org.apache.spark.storage.DiskBlockObjectWriter.close(DiskBlockObjectWriter.scala:107)
        at org.apache.spark.storage.DiskBlockObjectWriter.revertPartialWritesAndClose(DiskBlockObjectWriter.scala:159)
        at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.stop(BypassMergeSortShuffleWriter.java:234)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:85)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
        at org.apache.spark.scheduler.Task.run(Task.scala:86)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
17/11/21 04:29:37 WARN netty.OneWayOutboxMessage: Failed to send one-way RPC.
java.io.IOException: Failed to connect to /192.168.123.43:58889
        at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
        at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
        at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
        at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
        at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /192.168.123.43:58889
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        ... 1 more
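For context, here is a minimal Scala sketch of a job shaped like the one described above, with the stated executor resources set programmatically (driver memory normally has to be passed to spark-submit before the driver JVM starts); the input/output paths, the "some_key" column, and the aggregation are all hypothetical stand-ins:

    import org.apache.spark.sql.SparkSession

    object ParquetWriteJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("parquet-write")
          // Mirrors the hyper parameters above; spark.driver.memory (4g)
          // must be supplied at submit time rather than set here.
          .config("spark.executor.memory", "16g")
          .config("spark.executor.cores", "12")
          .config("spark.executor.instances", "8")
          .getOrCreate()

        // Hypothetical input and aggregation; a wide transformation like
        // this is what produces the shuffle stage seen in the stack trace.
        val result = spark.read.parquet("/mapr/chetan/input")
          .groupBy("some_key")
          .count()

        // The parquet files go to the target directory, but the
        // temp_shuffle_* files from the shuffle stage are written to the
        // executors' local disks, which is where the quota was exceeded.
        result.write.mode("overwrite").parquet("/mapr/chetan/output")
        spark.stop()
      }
    }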