Can you please post the line of code that is doing the df.write command?

On Tue, Sep 20, 2016 at 9:29 AM, Sankar Mittapally <sankar.mittapa...@creditvidya.com> wrote:
> Hey Kevin,
>
> It is an empty directory. Spark is able to write the part files into it,
> but we get the error above while it is merging those part files.
>
> Regards
>
> On Tue, Sep 20, 2016 at 7:46 PM, Kevin Mellott <kevin.r.mell...@gmail.com> wrote:
>
>> Have you checked to see if any files already exist at
>> /nfspartition/sankar/banking_l1_v2.csv? If so, you will need to delete
>> them before attempting to save your DataFrame to that location.
>> Alternatively, you may be able to set the "mode" option of the df.write
>> operation to "overwrite", depending on the version of Spark you are
>> running.
>>
>> *ERROR (from log)*
>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir[/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/.part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv.crc]: it still exists.
>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir[/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv]: it still exists.
>>
>> *df.write documentation*
>> http://spark.apache.org/docs/latest/api/R/write.df.html
>>
>> Thanks,
>> Kevin
>>
>> On Tue, Sep 20, 2016 at 12:16 AM, sankarmittapally <sankar.mittapa...@creditvidya.com> wrote:
>>
>>> We have set up a Spark cluster on NFS shared storage. There are no
>>> permission issues with the NFS storage; all of the users are able to
>>> write to it. When I run the write.df command in SparkR, I get the
>>> error below. Can someone please help me fix this issue?
>>>
>>> 16/09/17 08:03:28 ERROR InsertIntoHadoopFsRelationCommand: Aborting job.
>>> java.io.IOException: Failed to rename DeprecatedRawLocalFileStatus{path=file:/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv;
>>> isDirectory=false; length=436486316; replication=1; blocksize=33554432;
>>> modification_time=1474099400000; access_time=0; owner=; group=;
>>> permission=rw-rw-rw-; isSymlink=false}
>>> to file:/nfspartition/sankar/banking_l1_v2.csv/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv
>>> at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:371)
>>> at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:384)
>>> at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:326)
>>> at org.apache.spark.sql.execution.datasources.BaseWriterContainer.commitJob(WriterContainer.scala:222)
>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:144)
>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
>>> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>>> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>>> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>>> at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
>>> at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
>>> at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:487)
>>> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
>>> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:194)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>> at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
>>> at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:86)
>>> at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:38)
>>> at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>>> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>>> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>>> at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>>> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>>> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>>> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
>>> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>>> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>>> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>>> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>>> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>>> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>>> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>>> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>>> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>>> at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>>> at java.lang.Thread.run(Thread.java:745)
>>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir[/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/.part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv.crc]: it still exists.
>>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir[/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv]: it still exists.
>>> 16/09/17 08:03:28 ERROR DefaultWriterContainer: Job job_201609170803_0000 aborted.
>>> 16/09/17 08:03:28 ERROR RBackendHandler: save on 625 failed
>>> Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
>>> org.apache.spark.SparkException: Job aborted.
>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:149)
>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.doE
>>>
>>> --
>>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/write-df-is-failing-on-Spark-Cluster-tp27761.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>
>>
>
> --
> Regards
>
> Sankar Mittapally
> Senior Software Engineer
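
For anyone hitting the same rename failure: below is a minimal SparkR sketch of the two fixes Kevin suggests above, i.e. clearing the target directory before writing, or letting Spark replace it via the "mode" argument. It assumes Spark 2.x with an active SparkR session and a SparkDataFrame named df; the original poster's actual write.df call was never shared in this thread, so the path and options here are illustrative only.

    # Assumed setup: sparkR.session() has already been called and df is a
    # SparkDataFrame. Output path taken from the log above; adjust as needed.
    out_path <- "/nfspartition/sankar/banking_l1_v2.csv"

    # Fix 1: remove any leftover output first (including _temporary/ debris
    # left behind by a previously failed job), then write normally.
    if (dir.exists(out_path)) {
      unlink(out_path, recursive = TRUE)
    }
    write.df(df, path = out_path, source = "csv")

    # Fix 2 (alternative): ask Spark to replace existing output itself.
    # write.df's mode argument defaults to "error"; "overwrite" deletes the
    # existing path before writing.
    write.df(df, path = out_path, source = "csv", mode = "overwrite")

Note that neither fix changes the underlying commit behavior; if the rename still fails on the NFS mount after the target directory is clean, the problem likely lies with the storage layer rather than with existing files.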