[jira] [Commented] (SPARK-6962) Netty BlockTransferService hangs in the middle of SQL query
[ https://issues.apache.org/jira/browse/SPARK-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633016#comment-14633016 ]

Jon Chase commented on SPARK-6962:
----------------------------------

I'll check tomorrow on 1.4.0.

Netty BlockTransferService hangs in the middle of SQL query
-----------------------------------------------------------

                 Key: SPARK-6962
                 URL: https://issues.apache.org/jira/browse/SPARK-6962
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.2.0, 1.2.1, 1.3.0
            Reporter: Jon Chase
         Attachments: jstacks.txt

Spark SQL queries (though this seems to be a Spark Core issue - I'm just using queries in the REPL to surface it, so I mention Spark SQL) hang indefinitely under certain (not fully understood) circumstances. The hang is resolved by setting spark.shuffle.blockTransferService=nio, which seems to point to Netty as the culprit. Netty became the default block transport layer in 1.2.0, which is when this issue started. Setting the service to nio allows queries to complete normally.

I do not see this problem when running queries over smaller datasets (~20 5MB files). When I increase the scope to include more data (several hundred ~5MB files), the queries get through several steps but eventually hang indefinitely.

Here's the email chain regarding this issue, including stack traces:
http://mail-archives.apache.org/mod_mbox/spark-user/201503.mbox/cae61spfqt2y7d5vqzomzz2dmr-jx2c2zggcyky40npkjjx4...@mail.gmail.com

For context, here's the announcement regarding the block transfer service change:
http://mail-archives.apache.org/mod_mbox/spark-dev/201411.mbox/cabpqxssl04q+rbltp-d8w+z3atn+g-um6gmdgdnh-hzcvd-...@mail.gmail.com
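For anyone who wants to try the nio workaround called out above, here's a minimal sketch (Spark 1.x Java API) of setting the property when building the context; the class and app name are placeholders, and the same property can be passed with --conf spark.shuffle.blockTransferService=nio on spark-submit:

{code}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class NioFallbackRepro {
    public static void main(String[] args) {
        // Switch the shuffle block transfer service from the Netty default
        // back to NIO - the workaround described in this issue.
        SparkConf conf = new SparkConf()
                .setAppName("nio-fallback-repro")
                .set("spark.shuffle.blockTransferService", "nio");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... run the queries that previously hung ...
        sc.stop();
    }
}
{code}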
[jira] [Commented] (SPARK-6962) Netty BlockTransferService hangs in the middle of SQL query
[ https://issues.apache.org/jira/browse/SPARK-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500406#comment-14500406 ]

Jon Chase commented on SPARK-6962:
----------------------------------

I'm tailing the executor logs before/as this is happening, and I don't see anything out of the ordinary (errors, etc.). Here's what the logs look like when the lockup occurs (again, nothing out of the ordinary). I tailed all executors' logs, and they all look similar to this:

== /mnt/var/log/hadoop/yarn-hadoop-nodemanager-ip-XX-XX-XX-XXX.eu-west-1.compute.internal.log ==
2015-04-17 18:27:58,206 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl (Container Monitor): Memory usage of ProcessTree 11216 for container-id container_1429189930421_0012_01_02: 6.7 GB of 10 GB physical memory used; 11.3 GB of 50 GB virtual memory used
2015-04-17 18:28:01,214 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl (Container Monitor): Memory usage of ProcessTree 11216 for container-id container_1429189930421_0012_01_02: 6.7 GB of 10 GB physical memory used; 11.3 GB of 50 GB virtual memory used
2015-04-17 18:28:04,221 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl (Container Monitor): Memory usage of ProcessTree 11216 for container-id container_1429189930421_0012_01_02: 6.7 GB of 10 GB physical memory used; 11.3 GB of 50 GB virtual memory used
2015-04-17 18:28:07,229 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl (Container Monitor): Memory usage of ProcessTree 11216 for container-id container_1429189930421_0012_01_02: 6.7 GB of 10 GB physical memory used; 11.3 GB of 50 GB virtual memory used
[jira] [Commented] (SPARK-6962) Netty BlockTransferService hangs in the middle of SQL query
[ https://issues.apache.org/jira/browse/SPARK-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500419#comment-14500419 ]

Jon Chase commented on SPARK-6962:
----------------------------------

Looking at the UI when the lockup occurs, I see that every executor has 4 active tasks. It's not the case that, say, only a single executor has a task running - they all appear to be busy while locked up.
[jira] [Commented] (SPARK-6962) Netty BlockTransferService hangs in the middle of SQL query
[ https://issues.apache.org/jira/browse/SPARK-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500429#comment-14500429 ]

Jon Chase commented on SPARK-6962:
----------------------------------

Here's the stderr from the executors at the time of the lockup (there are 3 executors). 18:26:00 is when the lockup happened, and after 20+ minutes, these are still the most recent logs in executor 1:

15/04/17 18:26:00 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1132
15/04/17 18:26:00 INFO executor.Executor: Running task 110.0 in stage 15.0 (TID 1132)
15/04/17 18:26:00 INFO storage.ShuffleBlockFetcherIterator: Getting 1008 non-empty blocks out of 1008 blocks
15/04/17 18:26:00 INFO storage.ShuffleBlockFetcherIterator: Started 2 remote fetches in 3 ms
15/04/17 18:26:00 INFO executor.Executor: Finished task 107.0 in stage 15.0 (TID 1129). 8325 bytes result sent to driver
15/04/17 18:26:00 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1133
15/04/17 18:26:00 INFO executor.Executor: Running task 111.0 in stage 15.0 (TID 1133)
15/04/17 18:26:00 INFO storage.ShuffleBlockFetcherIterator: Getting 1008 non-empty blocks out of 1008 blocks
15/04/17 18:26:00 INFO storage.ShuffleBlockFetcherIterator: Started 2 remote fetches in 2 ms

Here's executor 2; it doesn't have any activity for about 20 minutes (again, the lockup happened at ~18:26:00):

15/04/17 18:25:48 INFO storage.ShuffleBlockFetcherIterator: Getting 1008 non-empty blocks out of 1008 blocks
15/04/17 18:25:48 INFO storage.ShuffleBlockFetcherIterator: Started 2 remote fetches in 11 ms
15/04/17 18:25:49 INFO executor.Executor: Finished task 13.0 in stage 15.0 (TID 1035). 12013 bytes result sent to driver
15/04/17 18:25:49 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1068
15/04/17 18:25:49 INFO executor.Executor: Running task 46.0 in stage 15.0 (TID 1068)
15/04/17 18:25:49 INFO storage.ShuffleBlockFetcherIterator: Getting 1008 non-empty blocks out of 1008 blocks
15/04/17 18:25:49 INFO storage.ShuffleBlockFetcherIterator: Started 2 remote fetches in 16 ms
15/04/17 18:41:19 WARN server.TransportChannelHandler: Exception in connection from /10.106.144.109:49697
java.io.IOException: Connection timed out
	at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
	at sun.nio.ch.IOUtil.read(IOUtil.java:192)
	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
	at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311)
	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
	at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:225)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
	at java.lang.Thread.run(Thread.java:745)
15/04/17 18:41:27 WARN server.TransportChannelHandler: Exception in connection from /10.106.145.10:38473
java.io.IOException: Connection timed out
	at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
	at sun.nio.ch.IOUtil.read(IOUtil.java:192)
	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
	at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311)
	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
	at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:225)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
	at java.lang.Thread.run(Thread.java:745)

Same with executor 3:

15/04/17 18:25:52 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1092
15/04/17 18:25:52 INFO
[jira] [Comment Edited] (SPARK-6962) Spark gets stuck on a step, hangs forever - jobs do not complete
[ https://issues.apache.org/jira/browse/SPARK-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498233#comment-14498233 ]

Jon Chase edited comment on SPARK-6962 at 4/16/15 4:17 PM:
-----------------------------------------------------------

Attaching the stack dumps I took when Spark is hanging.

was (Author: jonchase):
Here are the stack dumps I took when Spark is hanging.
[jira] [Commented] (SPARK-6962) Spark gets stuck on a step, hangs forever - jobs do not complete
[ https://issues.apache.org/jira/browse/SPARK-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498232#comment-14498232 ]

Jon Chase commented on SPARK-6962:
----------------------------------

I think it's different from SPARK-4395, as calling/omitting .cache() doesn't have any effect; once it hangs, I've never seen it finish (even after waiting many hours). It's also different from SPARK-5060, I believe, as the web UI accurately reports the remaining tasks as unfinished (or in progress, for the ones running when the hang occurs).

Here's my original post from the email thread:

===

Spark 1.3.0 on YARN (Amazon EMR), cluster of 10 m3.2xlarge (8 CPU, 30 GB), executor memory 20GB, driver memory 10GB.

I'm using Spark SQL, mainly via spark-shell, to query 15GB of data spread out over roughly 2,000 Parquet files, and my queries frequently hang. Simple queries like select count(*) from ... on the entire data set work ok. Slightly more demanding ones with group by's and some aggregate functions (percentile_approx, avg, etc.) work ok as well, as long as I have some criteria in my where clause to keep the number of rows down.

Once I hit some limit on query complexity and rows processed, my queries start to hang. I've left them for up to an hour without seeing any progress. No OOMs either - the job is just stuck. I've tried setting spark.sql.shuffle.partitions to 400 and even 800, but with the same results: usually near the end of the tasks (like 780 of 800 complete), progress just stops:

15/03/26 20:53:29 INFO scheduler.TaskSetManager: Finished task 788.0 in stage 1.0 (TID 1618) in 800 ms on ip-10-209-22-211.eu-west-1.compute.internal (748/800)
15/03/26 20:53:29 INFO scheduler.TaskSetManager: Finished task 793.0 in stage 1.0 (TID 1623) in 622 ms on ip-10-105-12-41.eu-west-1.compute.internal (749/800)
15/03/26 20:53:29 INFO scheduler.TaskSetManager: Finished task 797.0 in stage 1.0 (TID 1627) in 616 ms on ip-10-90-2-201.eu-west-1.compute.internal (750/800)
15/03/26 20:53:29 INFO scheduler.TaskSetManager: Finished task 799.0 in stage 1.0 (TID 1629) in 611 ms on ip-10-90-2-201.eu-west-1.compute.internal (751/800)
15/03/26 20:53:29 INFO scheduler.TaskSetManager: Finished task 795.0 in stage 1.0 (TID 1625) in 669 ms on ip-10-105-12-41.eu-west-1.compute.internal (752/800)

^^^ this is where it stays forever

Looking at the Spark UI, several of the executors still list active tasks. I do see that the Shuffle Read for executors that don't have any tasks remaining is around 100MB, whereas it's more like 10MB for the executors that still have tasks. The first stage, mapPartitions, always completes fine. It's the second stage (takeOrdered) that hangs. I've had this issue in 1.2.0 and 1.2.1 as well as 1.3.0, and I've also encountered it when using JSON files (instead of Parquet).
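For reference, the spark.sql.shuffle.partitions experiment described above can also be applied from code rather than spark-shell; a one-line sketch against the Spark 1.3 API, assuming sqlContext is an existing SQLContext and using the 800 value tried above:

{code}
// Raise the number of partitions used for SQL shuffles (default is 200).
sqlContext.setConf("spark.sql.shuffle.partitions", "800");
{code}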
[jira] [Updated] (SPARK-6962) Spark gets stuck on a step, hangs forever - jobs do not complete
[ https://issues.apache.org/jira/browse/SPARK-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Chase updated SPARK-6962:
-----------------------------
    Attachment: jstacks.txt

Here are the stack dumps I took when Spark is hanging.
[jira] [Updated] (SPARK-6570) Spark SQL arrays: explode() fails and cannot save array type to Parquet
[ https://issues.apache.org/jira/browse/SPARK-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Chase updated SPARK-6570:
-----------------------------
    Summary: Spark SQL arrays: explode() fails and cannot save array type to Parquet  (was: Spark SQL explode() fails, assumes underlying SQL array is represented by Scala Seq)

Spark SQL arrays: explode() fails and cannot save array type to Parquet
------------------------------------------------------------------------

                 Key: SPARK-6570
                 URL: https://issues.apache.org/jira/browse/SPARK-6570
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.3.0
            Reporter: Jon Chase

{code}
@Rule
public TemporaryFolder tmp = new TemporaryFolder();

@Test
public void testPercentileWithExplode() throws Exception {
    StructType schema = DataTypes.createStructType(Lists.newArrayList(
            DataTypes.createStructField("col1", DataTypes.StringType, false),
            DataTypes.createStructField("col2s", DataTypes.createArrayType(DataTypes.IntegerType, true), true)
    ));

    JavaRDD<Row> rowRDD = sc.parallelize(Lists.newArrayList(
            RowFactory.create("test", new int[]{1, 2, 3})
    ));

    DataFrame df = sql.createDataFrame(rowRDD, schema);
    df.registerTempTable("df");
    df.printSchema();

    List<int[]> ints = sql.sql("select col2s from df").javaRDD()
            .map(row -> (int[]) row.get(0)).collect();
    assertEquals(1, ints.size());
    assertArrayEquals(new int[]{1, 2, 3}, ints.get(0));

    // fails: lateral view explode does not work:
    // java.lang.ClassCastException: [I cannot be cast to scala.collection.Seq
    List<Integer> explodedInts = sql.sql("select col2 from df lateral view explode(col2s) splode as col2").javaRDD()
            .map(row -> row.getInt(0)).collect();
    assertEquals(3, explodedInts.size());
    assertEquals(Lists.newArrayList(1, 2, 3), explodedInts);

    // fails: java.lang.ClassCastException: [I cannot be cast to scala.collection.Seq
    df.saveAsParquetFile(tmp.getRoot().getAbsolutePath() + "/parquet");
    DataFrame loadedDf = sql.load(tmp.getRoot().getAbsolutePath() + "/parquet");
    loadedDf.registerTempTable("loadedDf");

    List<int[]> moreInts = sql.sql("select col2s from loadedDf").javaRDD()
            .map(row -> (int[]) row.get(0)).collect();
    assertEquals(1, moreInts.size());
    assertArrayEquals(new int[]{1, 2, 3}, moreInts.get(0));
}
{code}

{code}
root
 |-- col1: string (nullable = false)
 |-- col2s: array (nullable = true)
 |    |-- element: integer (containsNull = true)

ERROR org.apache.spark.executor.Executor Exception in task 7.0 in stage 1.0 (TID 15)
java.lang.ClassCastException: [I cannot be cast to scala.collection.Seq
	at org.apache.spark.sql.catalyst.expressions.Explode.eval(generators.scala:125) ~[spark-catalyst_2.10-1.3.0.jar:1.3.0]
	at org.apache.spark.sql.execution.Generate$$anonfun$2$$anonfun$apply$1.apply(Generate.scala:70) ~[spark-sql_2.10-1.3.0.jar:1.3.0]
	at org.apache.spark.sql.execution.Generate$$anonfun$2$$anonfun$apply$1.apply(Generate.scala:69) ~[spark-sql_2.10-1.3.0.jar:1.3.0]
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371) ~[scala-library-2.10.4.jar:na]
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) ~[scala-library-2.10.4.jar:na]
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) ~[scala-library-2.10.4.jar:na]
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) ~[scala-library-2.10.4.jar:na]
	at scala.collection.Iterator$class.foreach(Iterator.scala:727) ~[scala-library-2.10.4.jar:na]
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) ~[scala-library-2.10.4.jar:na]
{code}
[jira] [Commented] (SPARK-6570) Spark SQL explode() fails, assumes underlying SQL array is represented by Scala Seq
[ https://issues.apache.org/jira/browse/SPARK-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383794#comment-14383794 ]

Jon Chase commented on SPARK-6570:
----------------------------------

Stack trace for saveAsParquetFile():

{code}
root
 |-- col1: string (nullable = false)
 |-- col2s: array (nullable = true)
 |    |-- element: integer (containsNull = true)

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

ERROR org.apache.spark.executor.Executor Exception in task 7.0 in stage 1.0 (TID 15)
java.lang.ClassCastException: [I cannot be cast to scala.collection.Seq
	at org.apache.spark.sql.parquet.RowWriteSupport.writeValue(ParquetTableSupport.scala:185) ~[spark-sql_2.10-1.3.0.jar:1.3.0]
	at org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:171) ~[spark-sql_2.10-1.3.0.jar:1.3.0]
	at org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:134) ~[spark-sql_2.10-1.3.0.jar:1.3.0]
	at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:120) ~[parquet-hadoop-1.6.0rc3.jar:na]
	at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:81) ~[parquet-hadoop-1.6.0rc3.jar:na]
	at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:37) ~[parquet-hadoop-1.6.0rc3.jar:na]
	at org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:631) ~[spark-sql_2.10-1.3.0.jar:1.3.0]
	at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:648) ~[spark-sql_2.10-1.3.0.jar:1.3.0]
	at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:648) ~[spark-sql_2.10-1.3.0.jar:1.3.0]
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) ~[spark-core_2.10-1.3.0.jar:1.3.0]
	at org.apache.spark.scheduler.Task.run(Task.scala:64) ~[spark-core_2.10-1.3.0.jar:1.3.0]
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203) ~[spark-core_2.10-1.3.0.jar:1.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_31]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_31]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_31]

WARN o.a.spark.scheduler.TaskSetManager Lost task 7.0 in stage 1.0 (TID 15, localhost): java.lang.ClassCastException: [I cannot be cast to scala.collection.Seq
	at org.apache.spark.sql.parquet.RowWriteSupport.writeValue(ParquetTableSupport.scala:185)
	at org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:171)
	at org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:134)
	at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:120)
	at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:81)
	at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:37)
	at org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:631)
	at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:648)
	at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:648)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
	at org.apache.spark.scheduler.Task.run(Task.scala:64)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

ERROR o.a.spark.scheduler.TaskSetManager Task 7 in stage 1.0 failed 1 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 1.0 failed 1 times, most recent failure: Lost task 7.0 in stage 1.0 (TID 15, localhost): java.lang.ClassCastException: [I cannot be cast to scala.collection.Seq
	at org.apache.spark.sql.parquet.RowWriteSupport.writeValue(ParquetTableSupport.scala:185)
	at org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:171)
	at org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:134)
	at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:120)
	at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:81)
	at
{code}
[jira] [Commented] (SPARK-6554) Cannot use partition columns in where clause
[ https://issues.apache.org/jira/browse/SPARK-6554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381720#comment-14381720 ]

Jon Chase commented on SPARK-6554:
----------------------------------

Here's a test case to reproduce the issue:

{code}
@Test
public void testSpark_6554() {
    // given:
    DataFrame saveDF = sql.jsonRDD(
            sc.parallelize(Lists.newArrayList("{\"col1\": 1}")),
            DataTypes.createStructType(Lists.newArrayList(
                    DataTypes.createStructField("col1", DataTypes.IntegerType, false))));

    // when:
    saveDF.saveAsParquetFile(tmp.getRoot().getAbsolutePath() + "/col2=2");

    // then:
    DataFrame loadedDF = sql.load(tmp.getRoot().getAbsolutePath());
    assertEquals(1, loadedDF.count());
    assertEquals(2, loadedDF.schema().fieldNames().length);
    assertEquals("col1", loadedDF.schema().fieldNames()[0]);
    assertEquals("col2", loadedDF.schema().fieldNames()[1]);

    loadedDF.registerTempTable("df");

    // this query works
    Row[] results = sql.sql("select col1, col2 from df").collect();
    assertEquals(1, results.length);
    assertEquals(2, results[0].size());

    // this query is broken
    results = sql.sql("select col1, col2 from df where col2 > 0").collect();
    assertEquals(1, results.length);
    assertEquals(2, results[0].size());
}
{code}
[jira] [Created] (SPARK-6554) Cannot use partition columns in where clause
Jon Chase created SPARK-6554:
--------------------------------

             Summary: Cannot use partition columns in where clause
                 Key: SPARK-6554
                 URL: https://issues.apache.org/jira/browse/SPARK-6554
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.3.0
            Reporter: Jon Chase

I'm having trouble referencing partition columns in my queries with Parquet. In the following example, 'probeTypeId' is a partition column; the directory structure looks like this:

/mydata
    /probeTypeId=1
        ...files...
    /probeTypeId=2
        ...files...

I see the column when I load a DataFrame using the /mydata directory and call df.printSchema():

 |-- probeTypeId: integer (nullable = true)

Parquet is also aware of the column:

optional int32 probeTypeId;

And this works fine:

sqlContext.sql("select probeTypeId from df limit 1");

...as does df.show() - it shows the correct values for the partition column.

However, when I try to use a partition column in a where clause, I get an exception stating that the column was not found in the schema:

sqlContext.sql("select probeTypeId from df where probeTypeId = 1 limit 1");
...
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.IllegalArgumentException: Column [probeTypeId] was not found in schema!
	at parquet.Preconditions.checkArgument(Preconditions.java:47)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.getColumnDescriptor(SchemaCompatibilityValidator.java:172)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumn(SchemaCompatibilityValidator.java:160)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumnFilterPredicate(SchemaCompatibilityValidator.java:142)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:76)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:41)
	at parquet.filter2.predicate.Operators$Eq.accept(Operators.java:162)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.validate(SchemaCompatibilityValidator.java:46)
	at parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:41)
	at parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:22)
	at parquet.filter2.compat.FilterCompat$FilterPredicateCompat.accept(FilterCompat.java:108)
	at parquet.filter2.compat.RowGroupFilter.filterRowGroups(RowGroupFilter.java:28)
	at parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:158)
	...

Here's the full stack trace:

using local[*] for master
06:05:55,675 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set
06:05:55,683 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
06:05:55,694 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT]
06:05:55,721 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
06:05:55,768 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to INFO
06:05:55,768 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[ROOT]
06:05:55,769 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.
06:05:55,770 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@6aaceffd - Registering current configuration as safe fallback point

INFO org.apache.spark.SparkContext Running Spark version 1.3.0
WARN o.a.hadoop.util.NativeCodeLoader Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
INFO org.apache.spark.SecurityManager Changing view acls to: jon
INFO org.apache.spark.SecurityManager Changing modify acls to: jon
INFO org.apache.spark.SecurityManager SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jon); users with modify permissions: Set(jon)
INFO akka.event.slf4j.Slf4jLogger Slf4jLogger started
INFO Remoting Starting remoting
INFO Remoting Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.1.134:62493]
INFO org.apache.spark.util.Utils Successfully started service 'sparkDriver' on port 62493.
INFO org.apache.spark.SparkEnv Registering MapOutputTracker
INFO org.apache.spark.SparkEnv Registering BlockManagerMaster
INFO o.a.spark.storage.DiskBlockManager Created local directory at
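To make the failing scenario above concrete, here is a compressed sketch using the Spark 1.3 Java API; the /mydata path and the probeTypeId layout are the placeholders from the description:

{code}
// "/mydata" contains the probeTypeId=1/, probeTypeId=2/, ... partition dirs.
DataFrame df = sqlContext.parquetFile("/mydata");
df.printSchema();            // includes: probeTypeId: integer (nullable = true)
df.registerTempTable("df");

// Works: selecting the partition column.
sqlContext.sql("select probeTypeId from df limit 1").show();

// Fails with "Column [probeTypeId] was not found in schema!" once the
// partition column is used in the where clause.
sqlContext.sql("select probeTypeId from df where probeTypeId = 1 limit 1").show();
{code}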
[jira] [Commented] (SPARK-6554) Cannot use partition columns in where clause when Parquet filter push-down is enabled
[ https://issues.apache.org/jira/browse/SPARK-6554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381831#comment-14381831 ]

Jon Chase commented on SPARK-6554:
----------------------------------

spark.sql.parquet.filterPushdown was the problem. Leaving it set to false works around the issue for now. Thanks for jumping on this.

Cannot use partition columns in where clause when Parquet filter push-down is enabled
--------------------------------------------------------------------------------------

                 Key: SPARK-6554
                 URL: https://issues.apache.org/jira/browse/SPARK-6554
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.3.0
            Reporter: Jon Chase
            Assignee: Cheng Lian
            Priority: Critical
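For others hitting this before a fix lands, a one-line sketch of the workaround from this comment, assuming sqlContext is an existing SQLContext (the flag can also be set via --conf spark.sql.parquet.filterPushdown=false at submit time):

{code}
// Disable Parquet filter push-down so predicates on partition columns are
// no longer handed to the Parquet reader - the workaround described above.
sqlContext.setConf("spark.sql.parquet.filterPushdown", "false");
{code}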