[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968934#comment-15968934 ]

Steve Loughran commented on SPARK-10109:
----------------------------------------

SPARK-20038 should stop the failure being so dramatic.

> NPE when saving Parquet To HDFS
> -------------------------------
>
>                 Key: SPARK-10109
>                 URL: https://issues.apache.org/jira/browse/SPARK-10109
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.1
>        Environment: spark-ec2, standalone cluster on Amazon
>           Reporter: Virgil Palanciuc
>
> Very simple code, trying to save a dataframe.
> I get this in the driver:
> {quote}
> 15/08/19 11:21:41 INFO TaskSetManager: Lost task 9.2 in stage 217.0 (TID 4748) on executor 172.xx.xx.xx: java.lang.NullPointerException (null)
> and (not for that task):
> 15/08/19 11:21:46 WARN TaskSetManager: Lost task 5.0 in stage 543.0 (TID 5607, 172.yy.yy.yy): java.lang.NullPointerException
>         at parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:146)
>         at parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:112)
>         at parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:73)
>         at org.apache.spark.sql.parquet.ParquetOutputWriter.close(newParquet.scala:88)
>         at org.apache.spark.sql.sources.DynamicPartitionWriterContainer$$anonfun$clearOutputWriters$1.apply(commands.scala:536)
>         at org.apache.spark.sql.sources.DynamicPartitionWriterContainer$$anonfun$clearOutputWriters$1.apply(commands.scala:536)
>         at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:107)
>         at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:107)
>         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>         at scala.collection.mutable.HashMap$$anon$2.foreach(HashMap.scala:107)
>         at org.apache.spark.sql.sources.DynamicPartitionWriterContainer.clearOutputWriters(commands.scala:536)
>         at org.apache.spark.sql.sources.DynamicPartitionWriterContainer.abortTask(commands.scala:552)
>         at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.org$apache$spark$sql$sources$InsertIntoHadoopFsRelation$$writeRows$2(commands.scala:269)
>         at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insertWithDynamicPartitions$3.apply(commands.scala:229)
>         at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insertWithDynamicPartitions$3.apply(commands.scala:229)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
>         at org.apache.spark.scheduler.Task.run(Task.scala:70)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {quote}
> I get this in the executor log:
> {quote}
> 15/08/19 11:21:41 WARN DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /gglogs/2015-07-27/_temporary/_attempt_201508191119_0217_m_09_2/dpid=18432/pid=1109/part-r-9-46ac3a79-a95c-4d9c-a2f1-b3ee76f6a46c.snappy.parquet File does not exist. Holder DFSClient_NONMAPREDUCE_1730998114_63 does not have any open files.
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2396)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2387)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2183)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:481)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
>         at java.security.AccessController.doPrivileged(Native Method)
> {quote}
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933340#comment-15933340 ]

Steve Loughran commented on SPARK-10109:
----------------------------------------

This is somewhat related to the execution/commit mechanism: in {{FileOutputFormat}}, when the write fails, releaseResources() is called. If that call throws an exception, then the next time it is invoked it shouldn't try to close its {{currentWriter}} again.
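The guard Steve describes can be sketched as follows. This is a minimal illustration under stated assumptions, not Spark's actual {{FileOutputFormat}} code; {{SafeContainer}}, {{currentWriter}}, and {{releaseResources()}} are illustrative names echoing the comment.

```java
import java.io.Closeable;
import java.io.IOException;

// Sketch: releaseResources() clears its writer reference *before* closing,
// so if close() throws and releaseResources() runs again (e.g. from an
// abort path), the second call is a no-op instead of re-closing a dead writer.
public class SafeContainer {
    private Closeable currentWriter;
    private int closeAttempts = 0;

    public SafeContainer(Closeable writer) {
        this.currentWriter = writer;
    }

    /** Safe to call repeatedly, even after a failed close. */
    public void releaseResources() throws IOException {
        Closeable w = currentWriter;
        if (w == null) {
            return;               // already released: nothing left to close
        }
        currentWriter = null;     // clear BEFORE close, in case close() throws
        closeAttempts++;
        w.close();
    }

    public int closeAttempts() {
        return closeAttempts;
    }
}
```

Even when the wrapped close() fails, the writer is only closed once; a retry simply returns.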
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933324#comment-15933324 ]

Steve Loughran commented on SPARK-10109:
----------------------------------------

I think the cause is actually that in some code paths, if stream.close() raises an exception, the catch handler calls close() again. That isn't uncommon; it's general best practice to make sure everything gets closed, and it relies on two key assumptions:
# close() doesn't fail
# close() is re-entrant (thread-safe is generally not a requirement, though; at least it shouldn't be, given the java.io.OutputStream contract)

People probably see this with EMR/S3 as a destination for a key reason: close() is when the upload happens, *so it is where things go wrong*. Example: HADOOP-14204.

That said, it looks like the Parquet record writer could be more robust, as it doesn't prevent re-entrancy if called twice.
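To make that failure mode concrete, here is a toy writer, not Parquet's real code, that nulls an internal field during close() to release memory. A second close(), e.g. retried from a catch handler or an abort path, then dereferences null, which is the shape of the NPE in the stack traces above. {{FragileWriter}} and its fields are hypothetical.

```java
import java.io.IOException;

// Illustration of a single-shot close(): works once, NPEs if called again.
public class FragileWriter {
    private StringBuilder buffer = new StringBuilder("pending row group");

    private void flushRowGroupToStore() {
        // On a second close() this dereferences null: buffer was already
        // released by the first close().
        buffer.setLength(0);
    }

    public void close() throws IOException {
        flushRowGroupToStore();
        buffer = null;   // free the row-group memory; makes close() one-shot
    }
}
```

The first close() succeeds; any retry throws NullPointerException rather than failing cleanly or being a no-op.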
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932154#comment-15932154 ]

Hyukjin Kwon commented on SPARK-10109:
--------------------------------------

Then I think this should be resolvable when we upgrade Parquet to 1.9.0. close() should be idempotent there, per PARQUET-544.
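An idempotent close() of the kind PARQUET-544 aims for can be sketched with a simple flag; {{IdempotentWriter}} is illustrative, not the actual Parquet patch.

```java
import java.io.IOException;

// Sketch: a 'closed' flag makes close() idempotent, so an abort path that
// closes an already-closed writer is a harmless no-op instead of an NPE.
public class IdempotentWriter {
    private StringBuilder buffer = new StringBuilder("pending row group");
    private boolean closed = false;
    private int flushes = 0;

    public void close() throws IOException {
        if (closed) {
            return;              // second and later calls do nothing
        }
        closed = true;           // set first, so even a failing flush
        flushes++;               // never runs twice
        buffer.setLength(0);     // "flush" the pending row group
        buffer = null;           // release memory, as the one-shot writer did
    }

    public int flushCount() {
        return flushes;
    }
}
```

With this guard, the double-close in Spark's abort path would be absorbed instead of surfacing as a NullPointerException.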
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875960#comment-15875960 ]

Laurent Hoss commented on SPARK-10109:
--------------------------------------

This seems related to SPARK-2984; although that issue is closed, its last comment proposed creating a new one.
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157526#comment-15157526 ]

Yin Huai commented on SPARK-10109:
----------------------------------

InternalParquetRecordWriter's close method can only be called once. I think there is a place where we may call close twice when a task fails.
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156702#comment-15156702 ]

Virgil Palanciuc commented on SPARK-10109:
------------------------------------------

I get this on Spark 1.6.0 (EMR 4.3.0). I'm definitely not writing multithreaded this time; I'm just processing one (relatively large) chunk of data. It seems able to recover, so it's not a showstopper, but it is strange if you ask me.
{code}
java.lang.NullPointerException
        at org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:147)
        at org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:113)
        at org.apache.parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:112)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.close(ParquetRelation.scala:101)
        at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.abortTask$1(WriterContainer.scala:294)
        at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:271)
        at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
        at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
{code}
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048821#comment-15048821 ] Paul Zaczkieiwcz commented on SPARK-10109:
--
Still seeing this in 1.5.1. My use case is the same: I have several independent Spark jobs downloading data and appending to the same partitioned Parquet directory in HDFS. None of the partitions overlap, but the _temporary folder has become a bottleneck, allowing only one job to append at a time.
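The pattern described here is exactly where the failure mode lives: with a FileOutputCommitter-style committer, every job writing to a destination stages its files under <dest>/_temporary, and job commit deletes the whole _temporary tree, not just the committing job's slice. The plain-Scala sketch below is NOT Spark or Hadoop code — `stage` and `commit` are simplified stand-ins for the committer — but it shows how one job's commit wipes a concurrent job's staged output:

```scala
import java.nio.file.{Files, Path, StandardCopyOption}

object TempDirCollision {
  // Each job stages its output under <dest>/_temporary/<job>/, mirroring
  // how a FileOutputCommitter-style committer stages task output.
  def stage(dest: Path, job: String, part: String): Path = {
    val f = dest.resolve("_temporary").resolve(job).resolve(part)
    Files.createDirectories(f.getParent)
    Files.createFile(f)
  }

  // Job commit: publish this job's files, then delete the ENTIRE _temporary
  // tree -- including any other job's still-pending staging directories.
  def commit(dest: Path, job: String): Unit = {
    val staging = dest.resolve("_temporary").resolve(job)
    for (f <- staging.toFile.listFiles())
      Files.move(f.toPath, dest.resolve(f.getName), StandardCopyOption.REPLACE_EXISTING)
    def rm(p: java.io.File): Unit = {
      Option(p.listFiles()).foreach(_.foreach(rm))
      p.delete()
    }
    rm(dest.resolve("_temporary").toFile)
  }
}
```

Job B's subsequent flush then operates on files that no longer exist, which is consistent with the LeaseExpiredException ("File does not exist") in the executor log above.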
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1501#comment-1501 ] Yin Huai commented on SPARK-10109:
--
I see. So two concurrent jobs are writing to the same location; when they committed the job/task, they both touched the _temporary dir, which causes the problem. We definitely want to do better for this case. Let's leave this jira open.
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012402#comment-15012402 ] Yin Huai commented on SPARK-10109:
--
Is this still an issue, or can we resolve it?
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013088#comment-15013088 ] Virgil Palanciuc commented on SPARK-10109:
--
I haven't tried it lately - I just ran it with parallel execution disabled. As I said, the root cause seems to be that I was writing two dataframes simultaneously to the same destination (albeit with a partitioning scheme that made the destinations non-overlapping). I partition by the columns "dpid" and "pid"; if I process two PIDs in parallel (on different threads) and they both have the same DPID (e.g. I simultaneously write to /dpid=1/pid=1/ and to /dpid=1/pid=2/), I get this problem. On two different threads I call "df.write.partitionBy().parquet('')", i.e. I write two DataFrames simultaneously "to the same destination" (due to partitioning, the actual files should never overlap, but it still seems to be a problem). I'm not sure whether this is really a Spark bug or an application problem - if you think Spark should have worked in this scenario, let me know and I'll retry it.
But it rather feels like an application bug (a bad assumption on my part about how writing works).
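One workaround consistent with the comments in this thread (a hypothetical sketch, not an official Spark recipe): have each thread write to its own private root, so each writer owns its own _temporary directory, and only then move the finished partition directory under the shared root. The plain-Scala `publish` below models just that final move; the names `privateRoot` and `sharedRoot` are illustrative:

```scala
import java.nio.file.{Files, Path}

object PublishPartition {
  // Move a finished partition directory (e.g. "dpid=1/pid=2") from a
  // writer-private root into the shared root. Each writer keeps its own
  // _temporary dir under privateRoot, so concurrent writers never collide;
  // on HDFS the final rename within one filesystem is atomic, so readers
  // never observe a half-written partition.
  def publish(privateRoot: Path, sharedRoot: Path, partitionRelPath: String): Path = {
    val src = privateRoot.resolve(partitionRelPath)
    val dst = sharedRoot.resolve(partitionRelPath)
    Files.createDirectories(dst.getParent)
    Files.move(src, dst)
  }
}
```

In Spark terms this would correspond to writing each DataFrame to its own temporary directory first, then renaming the partition directories into place - at the cost of doing the final publish step yourself.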
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704710#comment-14704710 ] Virgil Palanciuc commented on SPARK-10109:
--
I think I know what caused this: the code above ran inside a parallel foreach, and the outputPath was always the same. I thought there shouldn't be a problem since the pids were always different (and the write is partitioned by pid), but I guess some metadata/temporary files were not separated per pid. After removing the parallel iteration on 'pid', it works fine.
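The fix applied here amounts to serializing the write step. A generalized sketch (hypothetical helper, not Spark API - `write` stands in for df.write.partitionBy(...).parquet(...)): keep the per-pid computation parallel, but hold a lock around the write so only one writer touches the destination's _temporary dir at a time:

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger
import java.util.concurrent.locks.ReentrantLock

object SerializedWrites {
  private val writeLock = new ReentrantLock()
  private val active = new AtomicInteger(0)
  val maxConcurrentWrites = new AtomicInteger(0) // observed writer overlap

  // Compute each pid's result on a thread pool, but let only one thread
  // at a time perform the write; `write` stands in for the Parquet save.
  def process(pids: Seq[Int])(compute: Int => String)(write: String => Unit): Unit = {
    val pool = Executors.newFixedThreadPool(4)
    pids.foreach { pid =>
      pool.execute { () =>
        val result = compute(pid)          // runs concurrently
        writeLock.lock()                   // serialize only the write step
        try {
          val n = active.incrementAndGet()
          maxConcurrentWrites.getAndUpdate((m: Int) => math.max(m, n))
          write(result)
          active.decrementAndGet()
        } finally writeLock.unlock()
      }
    }
    pool.shutdown()
    pool.awaitTermination(30, TimeUnit.SECONDS)
  }
}
```

This keeps the expensive per-pid computation parallel while giving up only the concurrency of the save itself, which is where the shared _temporary directory makes concurrent writers unsafe.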
[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702917#comment-14702917 ] Virgil Palanciuc commented on SPARK-10109:
--
Full version details (if it matters):
{quote}
Spark 1.4.1 built for Hadoop 2.0.0-mr1-cdh4.2.0
Build flags: -Psparkr -Phadoop-1 -Phive -Phive-thriftserver -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -DzincPort=3032
{quote}