Spark version, please?
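(A quick way to answer this from a SparkR session, assuming a SparkR build recent
enough to have sparkR.version(); packageVersion() is plain R and works regardless:)

    library(SparkR)
    sparkR.session()           # start (or reuse) a Spark session
    sparkR.version()           # Spark version the session is connected to
    packageVersion("SparkR")   # version of the installed SparkR package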
On 21 September 2016 at 09:46, Sankar Mittapally <
sankar.mittapa...@creditvidya.com> wrote:

> Yeah, I can do all operations on that folder.
>
> On Sep 21, 2016 12:15 AM, "Kevin Mellott" <kevin.r.mell...@gmail.com> wrote:
>
>> Are you able to manually delete the folder below? I'm wondering if there
>> is some sort of non-Spark factor involved (permissions, etc.).
>>
>> /nfspartition/sankar/banking_l1_v2.csv
>>
>> On Tue, Sep 20, 2016 at 12:19 PM, Sankar Mittapally <
>> sankar.mittapa...@creditvidya.com> wrote:
>>
>>> I used that one also.
>>>
>>> On Sep 20, 2016 10:44 PM, "Kevin Mellott" <kevin.r.mell...@gmail.com> wrote:
>>>
>>>> Instead of *mode="append"*, try *mode="overwrite"*.
>>>>
>>>> On Tue, Sep 20, 2016 at 11:30 AM, Sankar Mittapally <
>>>> sankar.mittapa...@creditvidya.com> wrote:
>>>>
>>>>> Please find the code below.
>>>>>
>>>>> sankar2 <- read.df("/nfspartition/sankar/test/2016/08/test.json")
>>>>>
>>>>> I tried these two commands:
>>>>>
>>>>> write.df(sankar2, "/nfspartition/sankar/test/test.csv", "csv",
>>>>>          header="true")
>>>>> saveDF(sankar2, "sankartest.csv", source="csv", mode="append",
>>>>>        schema="true")
>>>>>
>>>>> On Tue, Sep 20, 2016 at 9:40 PM, Kevin Mellott <
>>>>> kevin.r.mell...@gmail.com> wrote:
>>>>>
>>>>>> Can you please post the line of code that is doing the df.write
>>>>>> command?
>>>>>>
>>>>>> On Tue, Sep 20, 2016 at 9:29 AM, Sankar Mittapally <
>>>>>> sankar.mittapa...@creditvidya.com> wrote:
>>>>>>
>>>>>>> Hey Kevin,
>>>>>>>
>>>>>>> It is an empty directory. Spark is able to write the part files into
>>>>>>> it, but we get the error above while those part files are being
>>>>>>> merged.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> On Tue, Sep 20, 2016 at 7:46 PM, Kevin Mellott <
>>>>>>> kevin.r.mell...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Have you checked to see if any files already exist at
>>>>>>>> /nfspartition/sankar/banking_l1_v2.csv? If so, you will need to
>>>>>>>> delete them before attempting to save your DataFrame to that
>>>>>>>> location. Alternatively, you may be able to set the "mode" option of
>>>>>>>> the df.write operation to "overwrite", depending on the version of
>>>>>>>> Spark you are running. (A sketch combining both options appears
>>>>>>>> after the full log at the end of this thread.)
>>>>>>>>
>>>>>>>> *ERROR (from log)*
>>>>>>>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir
>>>>>>>> [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/.part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv.crc]:
>>>>>>>> it still exists.
>>>>>>>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir
>>>>>>>> [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv]:
>>>>>>>> it still exists.
>>>>>>>>
>>>>>>>> *df.write Documentation*
>>>>>>>> http://spark.apache.org/docs/latest/api/R/write.df.html
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Kevin
>>>>>>>> On Tue, Sep 20, 2016 at 12:16 AM, sankarmittapally <
>>>>>>>> sankar.mittapa...@creditvidya.com> wrote:
>>>>>>>>
>>>>>>>>> We have set up a Spark cluster on NFS shared storage. There are no
>>>>>>>>> permission issues with the NFS storage; all of the users are able
>>>>>>>>> to write to it. When I fire a write.df command in SparkR, I get
>>>>>>>>> the error below. Can someone please help me fix this issue?
>>>>>>>>>
>>>>>>>>> 16/09/17 08:03:28 ERROR InsertIntoHadoopFsRelationCommand: Aborting job.
>>>>>>>>> java.io.IOException: Failed to rename DeprecatedRawLocalFileStatus
>>>>>>>>> {path=file:/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv;
>>>>>>>>> isDirectory=false; length=436486316; replication=1; blocksize=33554432;
>>>>>>>>> modification_time=1474099400000; access_time=0; owner=; group=;
>>>>>>>>> permission=rw-rw-rw-; isSymlink=false}
>>>>>>>>> to
>>>>>>>>> file:/nfspartition/sankar/banking_l1_v2.csv/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv
>>>>>>>>>   at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:371)
>>>>>>>>>   at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:384)
>>>>>>>>>   at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:326)
>>>>>>>>>   at org.apache.spark.sql.execution.datasources.BaseWriterContainer.commitJob(WriterContainer.scala:222)
>>>>>>>>>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:144)
>>>>>>>>>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>>>>>>>>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>>>>>>>>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>>>>>>>>>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
>>>>>>>>>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
>>>>>>>>>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
>>>>>>>>>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
>>>>>>>>>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>>>>>>>>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>>>>>>>>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>>>>>>>>>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>>>>>>>>   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>>>>>>>>>   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>>>>>>>>>   at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
>>>>>>>>>   at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
>>>>>>>>>   at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:487)
>>>>>>>>>   at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
>>>>>>>>>   at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:194)
>>>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>>>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>>   at java.lang.reflect.Method.invoke(Method.java:498)
>>>>>>>>>   at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
>>>>>>>>>   at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:86)
>>>>>>>>>   at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:38)
>>>>>>>>>   at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>>>>>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>>>>>>>>>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>>>>>>>>>   at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>>>>>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>>>>>>>>>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>>>>>>>>>   at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
>>>>>>>>>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>>>>>>>>>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>>>>>>>>>   at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>>>>>>>>>   at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>>>>>>>>>   at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>>>>>>>>>   at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>>>>>>>>>   at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>>>>>>>>>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>>>>>>>>>   at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>>>>>>>>>   at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>>>>>>>>>   at java.lang.Thread.run(Thread.java:745)
>>>>>>>>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir
>>>>>>>>> [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/.part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv.crc]:
>>>>>>>>> it still exists.
>>>>>>>>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir
>>>>>>>>> [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv]:
>>>>>>>>> it still exists.
>>>>>>>>> 16/09/17 08:03:28 ERROR DefaultWriterContainer: Job job_201609170803_0000 aborted.
>>>>>>>>> 16/09/17 08:03:28 ERROR RBackendHandler: save on 625 failed
>>>>>>>>> Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
>>>>>>>>>   org.apache.spark.SparkException: Job aborted.
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:149)
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>>>>>>>>     at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
>>>>>>>>>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
>>>>>>>>>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
>>>>>>>>>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.doE
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/write-df-is-failing-on-Spark-Cluster-tp27761.html
>>>>>>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
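A minimal SparkR sketch combining the two fixes suggested up-thread (clear the
existing output directory, or write with mode="overwrite"). The read.df call
repeats the one posted above with the json source made explicit, the output path
is the one from the log, and mode="overwrite" assumes a write.df recent enough
to support it (see the documentation link in Kevin's reply):

    library(SparkR)
    sparkR.session()

    # The source data, as posted up-thread (source made explicit here).
    sankar2 <- read.df("/nfspartition/sankar/test/2016/08/test.json", "json")

    # Option 1: remove any leftover output (including _temporary files) before
    # writing. unlink() runs on the driver, which sees the same NFS mount as
    # the workers.
    unlink("/nfspartition/sankar/banking_l1_v2.csv", recursive = TRUE)

    # Option 2: have Spark replace the existing output instead of appending.
    write.df(sankar2, "/nfspartition/sankar/banking_l1_v2.csv",
             source = "csv", mode = "overwrite", header = "true")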