[jira] [Commented] (SPARK-6384) saveAsParquet doesn't clean up attempt_* folders

2017-03-14 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15925617#comment-15925617
 ] 

Takeshi Yamamuro commented on SPARK-6384:
-

Since this ticket has been almost inactive and the related code has changed 
completely (at least, SchemaRDD is gone), I'll close this. If you still have 
this problem, feel free to update the description and reopen it. Thanks!

> saveAsParquet doesn't clean up attempt_* folders
> 
>
> Key: SPARK-6384
> URL: https://issues.apache.org/jira/browse/SPARK-6384
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Rex Xiong
>
> After calling SchemaRDD.saveAsParquet, it runs well and generates the 
> *.parquet, _SUCCESS, _common_metadata and _metadata files successfully.
> But sometimes attempt_* folders (e.g. 
> attempt_201503170229_0006_r_06_736, 
> attempt_201503170229_0006_r_000404_416) are left under the same folder; each 
> contains one Parquet file and appears to be a working temp folder.
> This happens even though the _SUCCESS file was created.
> In this situation, Spark SQL (Hive table) throws an exception when loading 
> this Parquet folder:
> Error: java.io.FileNotFoundException: Path is not a file: 
> ../attempt_201503170229_0006_r_06_736
> at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:69)
> at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1728)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1671)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1651)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1625)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:503)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) (state=,code=0)
> I'm not sure whether it's a Spark bug or a Parquet bug.
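The leftover task-attempt directories described above can be swept out of the output folder before pointing a Hive table at it. Below is a minimal workaround sketch for a local filesystem; the function name and regex are illustrative (not from the ticket), and on HDFS one would do the equivalent listing and deletion through the Hadoop FileSystem API or `hadoop fs -rm -r` instead:

```python
import re
import shutil
from pathlib import Path

# Pattern of the stray task-attempt directories, e.g.
# attempt_201503170229_0006_r_000404_416 (timestamp, stage, map/reduce
# flag, task id, attempt id). Illustrative, not an official format spec.
ATTEMPT_DIR = re.compile(r"^attempt_\d+_\d+_[mr]_\d+_\d+$")

def clean_attempt_dirs(output_dir: str) -> list:
    """Remove stray attempt_* subdirectories from a Parquet output
    directory and return the paths that were removed."""
    removed = []
    for entry in Path(output_dir).iterdir():
        if entry.is_dir() and ATTEMPT_DIR.match(entry.name):
            shutil.rmtree(entry)  # drop the whole temp folder
            removed.append(str(entry))
    return removed
```

This only hides the symptom so the Hive load stops failing on directory entries; it does not address why the commit step leaves the folders behind.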



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6384) saveAsParquet doesn't clean up attempt_* folders

2015-08-05 Thread Thomas Dudziak (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659510#comment-14659510
 ] 

Thomas Dudziak commented on SPARK-6384:
---

I have seen this with ORC as well (using "INSERT INTO TABLE") in 1.4.1.

> saveAsParquet doesn't clean up attempt_* folders
> 
>
> Key: SPARK-6384
> URL: https://issues.apache.org/jira/browse/SPARK-6384
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Rex Xiong



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
