[jira] [Commented] (SPARK-10251) Some internal spark classes are not registered with kryo

2015-09-10 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738390#comment-14738390
 ] 

Marius Soutier commented on SPARK-10251:


Any chance for a backport to 1.4.2?

> Some internal spark classes are not registered with kryo
> 
>
> Key: SPARK-10251
> URL: https://issues.apache.org/jira/browse/SPARK-10251
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.4.1
>Reporter: Soren Macbeth
>Assignee: Ram Sriharsha
> Fix For: 1.6.0
>
>
> When running a job using kryo serialization and setting 
> `spark.kryo.registrationRequired=true`, some internal classes are not 
> registered, causing the job to die. This is still a problem when the setting 
> is false (the default), because Kryo then writes the full class name alongside 
> every unregistered object, making serialized objects much more expensive in 
> both storage space and runtime.
> {code}
> 15/08/25 20:28:21 WARN spark.scheduler.TaskSetManager: Lost task 0.0 in stage 
> 0.0 (TID 0, a.b.c.d): java.lang.IllegalArgumentException: Class is not 
> registered: scala.Tuple2[]
> Note: To register this class use: kryo.register(scala.Tuple2[].class);
> at com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:442)
> at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:79)
> at com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:472)
> at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:565)
> at 
> org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:250)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:236)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
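
For anyone stuck on an affected version, a minimal workaround sketch (assuming 
you control the SparkConf; the second registration is an illustrative guess at 
the next class Kryo would complain about) is to register the array classes the 
error names:

{code}
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrationRequired", "true")
  // classOf[Array[Tuple2[_, _]]] is exactly the scala.Tuple2[] class that the
  // "To register this class use" hint above asks for.
  .registerKryoClasses(Array(
    classOf[Array[Tuple2[_, _]]],
    classOf[Array[AnyRef]]  // assumption: plain object arrays often come up next
  ))
{code}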






[jira] [Commented] (SPARK-7600) Stopping Streaming Context (sometimes) crashes master

2015-05-21 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553999#comment-14553999
 ] 

Marius Soutier commented on SPARK-7600:
---

Yes it is, that might be the problem indeed.


 Stopping Streaming Context (sometimes) crashes master
 -

 Key: SPARK-7600
 URL: https://issues.apache.org/jira/browse/SPARK-7600
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.3.1
Reporter: Marius Soutier

 In my streaming job (that uses actorStreams) I'm stopping the SparkStreaming 
 context via ssc.stop(stopSparkContext = true, stopGracefully = true). 
 Sometimes this leads to the Spark master being in a permanent error state 
 that just displays an error page instead of the UI.
 The following is being logged when trying to access the master UI:
 15/05/13 15:57:15 WARN jetty.servlet.ServletHandler: /
 java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
   at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
   at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
   at 
 scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
   at scala.concurrent.Await$.result(package.scala:107)
   at org.apache.spark.deploy.master.ui.MasterPage.render(MasterPage.scala:47)
   at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:79)
   at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:79)
   at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:69)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
   at 
 org.spark-project.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
   at 
 org.spark-project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
   at 
 org.spark-project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
   at 
 org.spark-project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
   at 
 org.spark-project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
   at 
 org.spark-project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.spark-project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.spark-project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.spark-project.jetty.server.Server.handle(Server.java:370)
   at 
 org.spark-project.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
   at 
 org.spark-project.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
   at 
 org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
   at org.spark-project.jetty.http.HttpParser.parseNext(HttpParser.java:644)
   at 
 org.spark-project.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.spark-project.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
   at 
 org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
   at 
 org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
   at 
 org.spark-project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
   at 
 org.spark-project.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
   at java.lang.Thread.run(Thread.java:745)
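
For context, the shutdown path in question looks roughly like this (a minimal 
sketch; app name, batch interval, and stream wiring are placeholders, not the 
reporter's actual code):

{code}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("placeholder-streaming-app")
val ssc = new StreamingContext(conf, Seconds(60))
// ... register actor streams and output operations here ...
ssc.start()

// Later, on shutdown: the call that sometimes leaves the master UI broken.
ssc.stop(stopSparkContext = true, stopGracefully = true)
{code}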






[jira] [Comment Edited] (SPARK-7600) Stopping Streaming Context (sometimes) crashes master

2015-05-21 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553999#comment-14553999
 ] 

Marius Soutier edited comment on SPARK-7600 at 5/21/15 9:52 AM:


Yes it is, that might be the problem indeed. However, this particular problem 
happens when stopping the job.



was (Author: msoutier):
Yes it is, that might be the problem indeed.


 Stopping Streaming Context (sometimes) crashes master
 -

 Key: SPARK-7600
 URL: https://issues.apache.org/jira/browse/SPARK-7600
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.3.1
Reporter: Marius Soutier

 In my streaming job (that uses actorStreams) I'm stopping the SparkStreaming 
 context via ssc.stop(stopSparkContext = true, stopGracefully = true). 
 Sometimes this leads to the Spark master being in a permanent error state 
 that just displays an error page instead of the UI.
 The following is being logged when trying to access the master UI:
 15/05/13 15:57:15 WARN jetty.servlet.ServletHandler: /
 java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
   at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
   at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
   at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
   at 
 scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
   at scala.concurrent.Await$.result(package.scala:107)
   at org.apache.spark.deploy.master.ui.MasterPage.render(MasterPage.scala:47)
   at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:79)
   at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:79)
   at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:69)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
   at 
 org.spark-project.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
   at 
 org.spark-project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
   at 
 org.spark-project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
   at 
 org.spark-project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
   at 
 org.spark-project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
   at 
 org.spark-project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.spark-project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.spark-project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.spark-project.jetty.server.Server.handle(Server.java:370)
   at 
 org.spark-project.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
   at 
 org.spark-project.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
   at 
 org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
   at org.spark-project.jetty.http.HttpParser.parseNext(HttpParser.java:644)
   at 
 org.spark-project.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.spark-project.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
   at 
 org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
   at 
 org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
   at 
 org.spark-project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
   at 
 org.spark-project.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
   at java.lang.Thread.run(Thread.java:745)






[jira] [Created] (SPARK-7600) Stopping Streaming Context (sometimes) crashes master

2015-05-13 Thread Marius Soutier (JIRA)
Marius Soutier created SPARK-7600:
-

 Summary: Stopping Streaming Context (sometimes) crashes master
 Key: SPARK-7600
 URL: https://issues.apache.org/jira/browse/SPARK-7600
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.3.1
Reporter: Marius Soutier


In my streaming job (that uses actorStreams) I'm stopping the SparkStreaming 
context via ssc.stop(stopSparkContext = true, stopGracefully = true). Sometimes 
this leads to the Spark master being in a permanent error state that just 
displays an error page instead of the UI.

The following is being logged when trying to access the master UI:

15/05/13 15:57:15 WARN jetty.servlet.ServletHandler: /
java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
  at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
  at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
  at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
  at 
scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
  at scala.concurrent.Await$.result(package.scala:107)
  at org.apache.spark.deploy.master.ui.MasterPage.render(MasterPage.scala:47)
  at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:79)
  at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:79)
  at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:69)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
  at 
org.spark-project.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
  at 
org.spark-project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
  at 
org.spark-project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
  at 
org.spark-project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
  at 
org.spark-project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
  at 
org.spark-project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
  at 
org.spark-project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
  at 
org.spark-project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
  at org.spark-project.jetty.server.Server.handle(Server.java:370)
  at 
org.spark-project.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
  at 
org.spark-project.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
  at 
org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
  at org.spark-project.jetty.http.HttpParser.parseNext(HttpParser.java:644)
  at org.spark-project.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
  at 
org.spark-project.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
  at 
org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
  at 
org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
  at 
org.spark-project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
  at 
org.spark-project.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
  at java.lang.Thread.run(Thread.java:745)






[jira] [Commented] (SPARK-6613) Starting stream from checkpoint causes Streaming tab to throw error

2015-05-13 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541951#comment-14541951
 ] 

Marius Soutier commented on SPARK-6613:
---

It's still happening with 1.3.1.

 Starting stream from checkpoint causes Streaming tab to throw error
 ---

 Key: SPARK-6613
 URL: https://issues.apache.org/jira/browse/SPARK-6613
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1, 1.2.2, 1.3.1
Reporter: Marius Soutier

 When continuing my streaming job from a checkpoint, the job runs, but the 
 Streaming tab in the standard UI initially no longer works (browser just 
 shows HTTP ERROR: 500). Sometimes  it gets back to normal after a while, and 
 sometimes it stays in this state permanently.
 Stacktrace:
 WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
 java.util.NoSuchElementException: key not found: 0
   at scala.collection.MapLike$class.default(MapLike.scala:228)
   at scala.collection.AbstractMap.default(Map.scala:58)
   at scala.collection.MapLike$class.apply(MapLike.scala:141)
   at scala.collection.AbstractMap.apply(Map.scala:58)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:151)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:150)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at scala.collection.immutable.Range.foreach(Range.scala:141)
   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:150)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:149)
   at scala.Option.map(Option.scala:145)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener.lastReceivedBatchRecords(StreamingJobProgressListener.scala:149)
   at 
 org.apache.spark.streaming.ui.StreamingPage.generateReceiverStats(StreamingPage.scala:82)
   at 
 org.apache.spark.streaming.ui.StreamingPage.render(StreamingPage.scala:43)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:68)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
   at 
 org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.eclipse.jetty.server.Server.handle(Server.java:370)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
   at java.lang.Thread.run(Thread.java:745)
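
The trace bottoms out in a plain Map.apply, so the failure reduces to something 
like this (an illustrative sketch, not the actual listener code):

{code}
// After a checkpoint restart the listener's per-receiver map can be empty,
// yet the Streaming page still asks for receiver 0:
val recordsPerReceiver = Map.empty[Int, Long]
// recordsPerReceiver(0)            // java.util.NoSuchElementException: key not found: 0
recordsPerReceiver.getOrElse(0, 0L) // a defensive lookup would avoid the HTTP 500
{code}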




[jira] [Updated] (SPARK-6613) Starting stream from checkpoint causes Streaming tab to throw error

2015-05-13 Thread Marius Soutier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marius Soutier updated SPARK-6613:
--
Affects Version/s: 1.3.1

 Starting stream from checkpoint causes Streaming tab to throw error
 ---

 Key: SPARK-6613
 URL: https://issues.apache.org/jira/browse/SPARK-6613
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1, 1.2.2, 1.3.1
Reporter: Marius Soutier

 When continuing my streaming job from a checkpoint, the job runs, but the 
 Streaming tab in the standard UI initially no longer works (browser just 
 shows HTTP ERROR: 500). Sometimes  it gets back to normal after a while, and 
 sometimes it stays in this state permanently.
 Stacktrace:
 WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
 java.util.NoSuchElementException: key not found: 0
   at scala.collection.MapLike$class.default(MapLike.scala:228)
   at scala.collection.AbstractMap.default(Map.scala:58)
   at scala.collection.MapLike$class.apply(MapLike.scala:141)
   at scala.collection.AbstractMap.apply(Map.scala:58)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:151)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:150)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at scala.collection.immutable.Range.foreach(Range.scala:141)
   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:150)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:149)
   at scala.Option.map(Option.scala:145)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener.lastReceivedBatchRecords(StreamingJobProgressListener.scala:149)
   at 
 org.apache.spark.streaming.ui.StreamingPage.generateReceiverStats(StreamingPage.scala:82)
   at 
 org.apache.spark.streaming.ui.StreamingPage.render(StreamingPage.scala:43)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:68)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
   at 
 org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.eclipse.jetty.server.Server.handle(Server.java:370)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
   at java.lang.Thread.run(Thread.java:745)





[jira] [Commented] (SPARK-3928) Support wildcard matches on Parquet files

2015-05-07 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532663#comment-14532663
 ] 

Marius Soutier commented on SPARK-3928:
---

DataFrames now expect varargs, i.e. 
df.parquetFile("/path/to/file/1", "/path/to/file/2").
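
Concretely, a usage sketch (paths are placeholders; sqlContext and sc are 
assumed to be in scope):

{code}
// Spark 1.3-style varargs call:
val df = sqlContext.parquetFile("/path/to/file/1", "/path/to/file/2")

// whereas textFile accepts glob patterns directly:
val lines = sc.textFile("/data/2014-??-??/part-*")
{code}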


 Support wildcard matches on Parquet files
 -

 Key: SPARK-3928
 URL: https://issues.apache.org/jira/browse/SPARK-3928
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core, SQL
Reporter: Nicholas Chammas
Priority: Minor
 Fix For: 1.3.0


 {{SparkContext.textFile()}} supports patterns like {{part-*}} and 
 {{2014-\?\?-\?\?}}. 
 It would be nice if {{SparkContext.parquetFile()}} did the same.






[jira] [Commented] (SPARK-3928) Support wildcard matches on Parquet files

2015-05-07 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532704#comment-14532704
 ] 

Marius Soutier commented on SPARK-3928:
---

Wildcards were never supported and it seems they don't intend to change that. :(

 Support wildcard matches on Parquet files
 -

 Key: SPARK-3928
 URL: https://issues.apache.org/jira/browse/SPARK-3928
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core, SQL
Reporter: Nicholas Chammas
Priority: Minor
 Fix For: 1.3.0


 {{SparkContext.textFile()}} supports patterns like {{part-*}} and 
 {{2014-\?\?-\?\?}}. 
 It would be nice if {{SparkContext.parquetFile()}} did the same.






[jira] [Comment Edited] (SPARK-3928) Support wildcard matches on Parquet files

2015-05-07 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532663#comment-14532663
 ] 

Marius Soutier edited comment on SPARK-3928 at 5/7/15 1:54 PM:
---

DataFrames now expect varargs, i.e. 
df.parquetFile("/path/to/file/1", "/path/to/file/2").



was (Author: msoutier):
DataFrames now expect varagrs, i.e. 
df.parquetFile("/path/to/file/1", "/path/to/file/2").


 Support wildcard matches on Parquet files
 -

 Key: SPARK-3928
 URL: https://issues.apache.org/jira/browse/SPARK-3928
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core, SQL
Reporter: Nicholas Chammas
Priority: Minor
 Fix For: 1.3.0


 {{SparkContext.textFile()}} supports patterns like {{part-*}} and 
 {{2014-\?\?-\?\?}}. 
 It would be nice if {{SparkContext.parquetFile()}} did the same.






[jira] [Commented] (SPARK-3928) Support wildcard matches on Parquet files

2015-05-07 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532969#comment-14532969
 ] 

Marius Soutier commented on SPARK-3928:
---

Because parquetFile now takes a varargs parameter, which in turn is combined 
into a single path using mkString(","). This works just as before. The PR you 
link to still uses the old method with a single String parameter. It probably 
got lost in translation.
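
In other words, the varargs overload reduces to the old single-path form. A 
tiny runnable sketch of that behaviour as described above (not the actual 
Spark source):

{code}
// Hypothetical reduction of the varargs overload, per the comment above:
def combine(paths: String*): String = paths.mkString(",")

combine("/path/1.parquet", "/path/2.parquet")  // "/path/1.parquet,/path/2.parquet"
{code}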



 Support wildcard matches on Parquet files
 -

 Key: SPARK-3928
 URL: https://issues.apache.org/jira/browse/SPARK-3928
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core, SQL
Reporter: Nicholas Chammas
Priority: Minor
 Fix For: 1.3.0


 {{SparkContext.textFile()}} supports patterns like {{part-*}} and 
 {{2014-\?\?-\?\?}}. 
 It would be nice if {{SparkContext.parquetFile()}} did the same.






[jira] [Created] (SPARK-7167) Receivers are not distributed efficiently

2015-04-27 Thread Marius Soutier (JIRA)
Marius Soutier created SPARK-7167:
-

 Summary: Receivers are not distributed efficiently
 Key: SPARK-7167
 URL: https://issues.apache.org/jira/browse/SPARK-7167
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.2, 1.2.1
Reporter: Marius Soutier


Bug report: I'm seeing an issue where after starting a streaming application 
from a checkpoint, the network receivers are distributed such that not all 
nodes are used.

For example, I have five nodes:
node0 - 1 receiver
node1 - 2 receivers
node2 - 0 receivers
node3 - 2 receivers
node4 - 0 receivers

This slows down the job, waiting batches pile up, and I have to kill and 
restart it, hoping that next time it will be distributed in a sensible fashion.







[jira] [Updated] (SPARK-7167) Receivers are not distributed efficiently when starting from checkpoint

2015-04-27 Thread Marius Soutier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marius Soutier updated SPARK-7167:
--
Summary: Receivers are not distributed efficiently when starting from 
checkpoint  (was: Receivers are not distributed efficiently)

 Receivers are not distributed efficiently when starting from checkpoint
 ---

 Key: SPARK-7167
 URL: https://issues.apache.org/jira/browse/SPARK-7167
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1, 1.2.2
Reporter: Marius Soutier

 Bug report: I'm seeing an issue where after starting a streaming application 
 from a checkpoint, the network receivers are distributed such that not all 
 nodes are used.
 For example, I have five nodes:
 node0 - 1 receiver
 node1 - 2 receivers
 node2 - 0 receivers
 node3 - 2 receivers
 node4 - 0 receivers
 This slows down the job, waiting batches pile up, and I have to kill and 
 restart it, hoping that next time it will be distributed in a sensible 
 fashion.






[jira] [Updated] (SPARK-6613) Starting stream from checkpoint causes Streaming tab to throw error

2015-04-27 Thread Marius Soutier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marius Soutier updated SPARK-6613:
--
Affects Version/s: 1.2.2

 Starting stream from checkpoint causes Streaming tab to throw error
 ---

 Key: SPARK-6613
 URL: https://issues.apache.org/jira/browse/SPARK-6613
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1, 1.2.2
Reporter: Marius Soutier

 When continuing my streaming job from a checkpoint, the job runs, but the 
 Streaming tab in the standard UI initially no longer works (browser just 
 shows HTTP ERROR: 500). Sometimes  it gets back to normal after a while, and 
 sometimes it stays in this state permanently.
 Stacktrace:
 WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
 java.util.NoSuchElementException: key not found: 0
   at scala.collection.MapLike$class.default(MapLike.scala:228)
   at scala.collection.AbstractMap.default(Map.scala:58)
   at scala.collection.MapLike$class.apply(MapLike.scala:141)
   at scala.collection.AbstractMap.apply(Map.scala:58)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:151)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:150)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at scala.collection.immutable.Range.foreach(Range.scala:141)
   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:150)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:149)
   at scala.Option.map(Option.scala:145)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener.lastReceivedBatchRecords(StreamingJobProgressListener.scala:149)
   at 
 org.apache.spark.streaming.ui.StreamingPage.generateReceiverStats(StreamingPage.scala:82)
   at 
 org.apache.spark.streaming.ui.StreamingPage.render(StreamingPage.scala:43)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:68)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
   at 
 org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.eclipse.jetty.server.Server.handle(Server.java:370)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
   at java.lang.Thread.run(Thread.java:745)





[jira] [Comment Edited] (SPARK-7167) Receivers are not distributed efficiently when starting from checkpoint

2015-04-27 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513976#comment-14513976
 ] 

Marius Soutier edited comment on SPARK-7167 at 4/27/15 12:09 PM:
-

Maybe the slowdown is only incidental, though it's odd at a batch interval of 1 
minute and 40-50 records per interval.

In my case I have an actor system running on each worker node that receives 
data and forwards it to a registered actor receiver (ssc.actorStream(...)), so 
this results in additional network traffic, but that should not be a problem at 
10 Gbit. (I'm also aware that actorStream is not really a production-ready 
feature.)

But in any case, from the documentation:

For example, a single Kafka input DStream receiving two topics of data can be 
split into two Kafka input streams, each receiving only one topic. This would 
run two receivers on two workers [...]

So receivers should be distributed equally on the cluster, and this appears to 
be a bug.

I also noticed the receivers get redistributed all the time.
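
The documented pattern amounts to creating several input streams and unioning 
them; roughly (a sketch reusing this job's actor receivers, with ssc assumed in 
scope; the actor class and receiver count are placeholders):

{code}
import akka.actor.{Actor, Props}
import org.apache.spark.streaming.receiver.ActorHelper

// Placeholder receiver actor, illustrative only:
class MyReceiverActor extends Actor with ActorHelper {
  def receive = { case s: String => store(s) }
}

// One receiver per intended worker, then union them back into a single stream:
val streams = (1 to 5).map(i => ssc.actorStream[String](Props[MyReceiverActor], s"receiver-$i"))
val unified = ssc.union(streams)
{code}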



was (Author: msoutier):
Maybe the slowdown is only incidental, though it's odd at a batch interval of 1 
minute and 40-50 records per interval.

In my case I have an actor system running on each worker node that receives 
data and forwards it to a registered actor receiver (ssc.actorStream(...)), so 
this results in additional network traffic, but that should not be a problem at 
10 Gbit. (I'm also aware that actorStream is not really a production-ready 
feature.)

But in any case, from the documentation:

For example, a single Kafka input DStream receiving two topics of data can be 
split into two Kafka input streams, each receiving only one topic. This would 
run two receivers on two workers [...]

So receivers should be distributed equally on the cluster, and this appears to 
be a bug.


 Receivers are not distributed efficiently when starting from checkpoint
 ---

 Key: SPARK-7167
 URL: https://issues.apache.org/jira/browse/SPARK-7167
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1, 1.2.2
Reporter: Marius Soutier
Priority: Minor

 Bug report: I'm seeing an issue where after starting a streaming application 
 from a checkpoint, the network receivers are distributed such that not all 
 nodes are used.
 For example, I have five nodes:
 node0 - 1 receiver
 node1 - 2 receivers
 node2 - 0 receivers
 node3 - 2 receivers
 node4 - 0 receivers
 This slows down the job, waiting batches pile up, and I have to kill and 
 restart it, hoping that next time it will be distributed in a sensible 
 fashion.






[jira] [Commented] (SPARK-7167) Receivers are not distributed efficiently when starting from checkpoint

2015-04-27 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513976#comment-14513976
 ] 

Marius Soutier commented on SPARK-7167:
---

Maybe the slowdown is only incidental, though it's odd at a batch interval of 1 
minute and 40-50 records per interval.

In my case I have an actor system running on each worker node that receives 
data and forwards it to a registered actor receiver (ssc.actorStream(...)), so 
this results in additional network traffic, but that should not be a problem at 
10 Gbit. (I'm also aware that actorStream is not really a production-ready 
feature.)

But in any case, from the documentation:

For example, a single Kafka input DStream receiving two topics of data can be 
split into two Kafka input streams, each receiving only one topic. This would 
run two receivers on two workers [...]

So receivers should be distributed equally on the cluster, and this appears to 
be a bug.


 Receivers are not distributed efficiently when starting from checkpoint
 ---

 Key: SPARK-7167
 URL: https://issues.apache.org/jira/browse/SPARK-7167
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1, 1.2.2
Reporter: Marius Soutier
Priority: Minor

 Bug report: I'm seeing an issue where after starting a streaming application 
 from a checkpoint, the network receivers are distributed such that not all 
 nodes are used.
 For example, I have five nodes:
 node0 - 1 receiver
 node1 - 2 receivers
 node2 - 0 receivers
 node3 - 2 receivers
 node4 - 0 receivers
 This slows down the job, waiting batches pile up, and I have to kill and 
 restart it, hoping that next time it will be distributed in a sensible 
 fashion.






[jira] [Updated] (SPARK-7167) Receivers are not distributed efficiently when starting from checkpoint

2015-04-27 Thread Marius Soutier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marius Soutier updated SPARK-7167:
--
Attachment: Screen Shot 2015-04-27 at 14.10.05.jpg

 Receivers are not distributed efficiently when starting from checkpoint
 ---

 Key: SPARK-7167
 URL: https://issues.apache.org/jira/browse/SPARK-7167
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1, 1.2.2
Reporter: Marius Soutier
Priority: Minor
 Attachments: Screen Shot 2015-04-27 at 14.10.05.jpg


 Bug report: I'm seeing an issue where after starting a streaming application 
 from a checkpoint, the network receivers are distributed such that not all 
 nodes are used.
 For example, I have five nodes:
 node0 - 1 receiver
 node1 - 2 receivers
 node2 - 0 receivers
 node3 - 2 receivers
 node4 - 0 receivers
 This slows down the job, waiting batches pile up, and I have to kill and 
 restart it, hoping that next time it will be distributed in a sensible 
 fashion.






[jira] [Updated] (SPARK-7028) Add filterNot to RDD

2015-04-21 Thread Marius Soutier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marius Soutier updated SPARK-7028:
--
  Priority: Minor  (was: Major)
Issue Type: Improvement  (was: Bug)

 Add filterNot to RDD
 

 Key: SPARK-7028
 URL: https://issues.apache.org/jira/browse/SPARK-7028
 Project: Spark
  Issue Type: Improvement
Reporter: Marius Soutier
Priority: Minor

 The Scala collection APIs have not only `filter`, but also `filterNot` for 
 convenience and readability. I'd suggest adding the same to RDD.
 I can submit a PR.
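
Until something like that lands, the same convenience is a one-line extension 
(a sketch, not the proposed PR itself):

{code}
import org.apache.spark.rdd.RDD

implicit class FilterNotOps[T](rdd: RDD[T]) {
  // Mirrors scala.collection's filterNot: keep elements NOT matching p.
  def filterNot(p: T => Boolean): RDD[T] = rdd.filter(x => !p(x))
}

// usage: someStringRDD.filterNot(_.isEmpty)
{code}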






[jira] [Created] (SPARK-7028) Add filterNot to RDD

2015-04-21 Thread Marius Soutier (JIRA)
Marius Soutier created SPARK-7028:
-

 Summary: Add filterNot to RDD
 Key: SPARK-7028
 URL: https://issues.apache.org/jira/browse/SPARK-7028
 Project: Spark
  Issue Type: Bug
Reporter: Marius Soutier


The Scala collection APIs have not only `filter`, but also `filterNot` for 
convenience and readability. I'd suggest adding the same to RDD.

I can submit a PR.







[jira] [Commented] (SPARK-6613) Starting stream from checkpoint causes Streaming tab to throw error

2015-04-02 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392234#comment-14392234
 ] 

Marius Soutier commented on SPARK-6613:
---

It's a combination of actorStreams and StreamingContext.getOrCreate(). I've 
started to update the actorStream example from spark-examples, but it will take 
some more time to complete it. I'll post the code here.
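
The combination in question, schematically (a sketch; checkpoint directory, 
batch interval, and stream wiring are placeholders):

{code}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("placeholder-actor-stream-app")
  val ssc = new StreamingContext(conf, Seconds(60))
  ssc.checkpoint("/placeholder/checkpoint")
  // ... ssc.actorStream(...) registrations and output operations here ...
  ssc
}

// On restart this restores from the checkpoint instead of calling createContext,
// which is when the Streaming tab starts returning 500s:
val ssc = StreamingContext.getOrCreate("/placeholder/checkpoint", createContext _)
ssc.start()
{code}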


 Starting stream from checkpoint causes Streaming tab to throw error
 ---

 Key: SPARK-6613
 URL: https://issues.apache.org/jira/browse/SPARK-6613
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier

 When continuing my streaming job from a checkpoint, the job runs, but the 
 Streaming tab in the standard UI initially no longer works (browser just 
 shows HTTP ERROR: 500). Sometimes  it gets back to normal after a while, and 
 sometimes it stays in this state permanently.
 Stacktrace:
 WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
 java.util.NoSuchElementException: key not found: 0
   at scala.collection.MapLike$class.default(MapLike.scala:228)
   at scala.collection.AbstractMap.default(Map.scala:58)
   at scala.collection.MapLike$class.apply(MapLike.scala:141)
   at scala.collection.AbstractMap.apply(Map.scala:58)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:151)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:150)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at scala.collection.immutable.Range.foreach(Range.scala:141)
   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:150)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:149)
   at scala.Option.map(Option.scala:145)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener.lastReceivedBatchRecords(StreamingJobProgressListener.scala:149)
   at 
 org.apache.spark.streaming.ui.StreamingPage.generateReceiverStats(StreamingPage.scala:82)
   at 
 org.apache.spark.streaming.ui.StreamingPage.render(StreamingPage.scala:43)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:68)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
   at 
 org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.eclipse.jetty.server.Server.handle(Server.java:370)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
   at 
 

[jira] [Commented] (SPARK-6613) Starting stream from checkpoint causes Streaming tab to throw error

2015-04-01 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390423#comment-14390423
 ] 

Marius Soutier commented on SPARK-6613:
---

Bug report.

 Starting stream from checkpoint causes Streaming tab to throw error
 ---

 Key: SPARK-6613
 URL: https://issues.apache.org/jira/browse/SPARK-6613
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier

 When continuing my streaming job from a checkpoint, the job runs, but the 
 Streaming tab in the standard UI initially no longer works (browser just 
 shows HTTP ERROR: 500). Sometimes  it gets back to normal after a while, and 
 sometimes it stays in this state permanently.
 Stacktrace:
 WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
 java.util.NoSuchElementException: key not found: 0
   at scala.collection.MapLike$class.default(MapLike.scala:228)
   at scala.collection.AbstractMap.default(Map.scala:58)
   at scala.collection.MapLike$class.apply(MapLike.scala:141)
   at scala.collection.AbstractMap.apply(Map.scala:58)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:151)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:150)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
   at scala.collection.immutable.Range.foreach(Range.scala:141)
   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:150)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:149)
   at scala.Option.map(Option.scala:145)
   at 
 org.apache.spark.streaming.ui.StreamingJobProgressListener.lastReceivedBatchRecords(StreamingJobProgressListener.scala:149)
   at 
 org.apache.spark.streaming.ui.StreamingPage.generateReceiverStats(StreamingPage.scala:82)
   at 
 org.apache.spark.streaming.ui.StreamingPage.render(StreamingPage.scala:43)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
   at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:68)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
   at 
 org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.eclipse.jetty.server.Server.handle(Server.java:370)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
   at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
   at java.lang.Thread.run(Thread.java:745)




[jira] [Created] (SPARK-6648) Reading Parquet files with different sub-files doesn't work

2015-04-01 Thread Marius Soutier (JIRA)
Marius Soutier created SPARK-6648:
-

 Summary: Reading Parquet files with different sub-files doesn't 
work
 Key: SPARK-6648
 URL: https://issues.apache.org/jira/browse/SPARK-6648
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.1
Reporter: Marius Soutier


When reading from multiple parquet files (via 
sqlContext.parquetFile("/path/1.parquet,/path/2.parquet")), if the parquet files 
were created using a different coalesce, the read fails with:

ERROR c.w.r.websocket.ParquetReader default-dispatcher-63 : Failed reading 
parquet file
java.lang.IllegalArgumentException: Could not find Parquet metadata at path 
path
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$readMetaData$4.apply(ParquetTypes.scala:459)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]

at 
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$readMetaData$4.apply(ParquetTypes.scala:459)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at scala.Option.getOrElse(Option.scala:120) 
~[org.scala-lang.scala-library-2.10.4.jar:na]
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$.readMetaData(ParquetTypes.scala:458)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$.readSchemaFromFile(ParquetTypes.scala:477)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at 
org.apache.spark.sql.parquet.ParquetRelation.<init>(ParquetRelation.scala:65) 
~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at org.apache.spark.sql.SQLContext.parquetFile(SQLContext.scala:165) 
~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]

I haven't tested with Spark 1.3 yet but will report back after upgrading to 
1.3.1 (as soon as it's released).
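
A reproduction along these lines (a hypothetical sketch; df1/df2 and the paths 
are placeholders, using the 1.2-era saveAsParquetFile write API):

{code}
// Write the two datasets with different partition counts:
df1.coalesce(1).saveAsParquetFile("/path/1.parquet")  // part-r-00001 only
df2.coalesce(3).saveAsParquetFile("/path/2.parquet")  // part-r-00001 .. 00003

// Reading both together then fails with the metadata error above:
val merged = sqlContext.parquetFile("/path/1.parquet,/path/2.parquet")
{code}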







[jira] [Updated] (SPARK-6648) Reading Parquet files with different sub-files doesn't work

2015-04-01 Thread Marius Soutier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marius Soutier updated SPARK-6648:
--
Description: 
When reading from multiple parquet files (via 
sqlContext.parquetFile("/path/1.parquet,/path/2.parquet")), and one of the parquet 
files is being overwritten using a different coalesce (e.g. one only contains 
part-r-1.parquet, the other also part-r-2.parquet, part-r-3.parquet), the 
read fails with:

ERROR c.w.r.websocket.ParquetReader default-dispatcher-63 : Failed reading 
parquet file
java.lang.IllegalArgumentException: Could not find Parquet metadata at path 
path
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$readMetaData$4.apply(ParquetTypes.scala:459)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]

at 
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$readMetaData$4.apply(ParquetTypes.scala:459)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at scala.Option.getOrElse(Option.scala:120) 
~[org.scala-lang.scala-library-2.10.4.jar:na]
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$.readMetaData(ParquetTypes.scala:458)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$.readSchemaFromFile(ParquetTypes.scala:477)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at 
org.apache.spark.sql.parquet.ParquetRelation.<init>(ParquetRelation.scala:65) 
~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at org.apache.spark.sql.SQLContext.parquetFile(SQLContext.scala:165) 
~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]

I haven't tested with Spark 1.3 yet but will report back after upgrading to 
1.3.1 (as soon as it's released).


  was:
When reading from multiple parquet files (via 
sqlContext.parquetFile("/path/1.parquet,/path/2.parquet")), if the parquet files 
were created using a different coalesce (e.g. one only contains 
part-r-1.parquet, the other also part-r-2.parquet, part-r-3.parquet), the 
read fails with:

ERROR c.w.r.websocket.ParquetReader default-dispatcher-63 : Failed reading 
parquet file
java.lang.IllegalArgumentException: Could not find Parquet metadata at path 
path
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$readMetaData$4.apply(ParquetTypes.scala:459)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]

at 
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$readMetaData$4.apply(ParquetTypes.scala:459)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at scala.Option.getOrElse(Option.scala:120) 
~[org.scala-lang.scala-library-2.10.4.jar:na]
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$.readMetaData(ParquetTypes.scala:458)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at 
org.apache.spark.sql.parquet.ParquetTypesConverter$.readSchemaFromFile(ParquetTypes.scala:477)
 ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at 
org.apache.spark.sql.parquet.ParquetRelation.<init>(ParquetRelation.scala:65) 
~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
at org.apache.spark.sql.SQLContext.parquetFile(SQLContext.scala:165) 
~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]

I haven't tested with Spark 1.3 yet but will report back after upgrading to 
1.3.1 (as soon as it's released).



 Reading Parquet files with different sub-files doesn't work
 ---

 Key: SPARK-6648
 URL: https://issues.apache.org/jira/browse/SPARK-6648
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.1
Reporter: Marius Soutier

 When reading from multiple parquet files (via 
 sqlContext.parquetFile("/path/1.parquet,/path/2.parquet")), and one of the 
 parquet files is being overwritten using a different coalesce (e.g. one only 
 contains part-r-1.parquet, the other also part-r-2.parquet and 
 part-r-3.parquet), the reading fails with:
 ERROR c.w.r.websocket.ParquetReader default-dispatcher-63 : Failed reading 
 parquet file
 java.lang.IllegalArgumentException: Could not find Parquet metadata at path 
 path
 at 
 org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$readMetaData$4.apply(ParquetTypes.scala:459)
  ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
   at 
 org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$readMetaData$4.apply(ParquetTypes.scala:459)
  ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
   at scala.Option.getOrElse(Option.scala:120) 
 ~[org.scala-lang.scala-library-2.10.4.jar:na]
   at 
 org.apache.spark.sql.parquet.ParquetTypesConverter$.readMetaData(ParquetTypes.scala:458)
  ~[org.apache.spark.spark-sql_2.10-1.2.1.jar:1.2.1]
   at 
 org.apache.spark.sql.parquet.ParquetTypesConverter$.readSchemaFromFile(ParquetTypes.scala:477)
  

[jira] [Updated] (SPARK-6613) Starting stream from checkpoint causes Streaming tab to throw error

2015-03-31 Thread Marius Soutier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marius Soutier updated SPARK-6613:
--
Description: 
When continuing my streaming job from a checkpoint, the job runs, but the 
Streaming tab in the standard UI initially no longer works (browser just shows 
HTTP ERROR: 500). Sometimes it gets back to normal after a while, and 
sometimes it stays in this state permanently.

Stacktrace:

WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
java.util.NoSuchElementException: key not found: 0
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:151)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:150)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.Range.foreach(Range.scala:141)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:150)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:149)
at scala.Option.map(Option.scala:145)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener.lastReceivedBatchRecords(StreamingJobProgressListener.scala:149)
at 
org.apache.spark.streaming.ui.StreamingPage.generateReceiverStats(StreamingPage.scala:82)
at 
org.apache.spark.streaming.ui.StreamingPage.render(StreamingPage.scala:43)
at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:68)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at 
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at 
org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at 
org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
at 
org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)


  was:
When continuing my streaming job from a checkpoint, the job runs, but the 
Streaming tab in the standard UI initially no longer works (browser just shows 
HTTP ERROR: 500). After a while, it gets back to normal, at least most of the 
time (sometimes it doesn't work at all, but that's rare).

Stacktrace:

WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
java.util.NoSuchElementException: key not found: 0
at scala.collection.MapLike$class.default(MapLike.scala:228)

[jira] [Updated] (SPARK-6613) Starting stream from checkpoint causes Streaming tab to throw error

2015-03-30 Thread Marius Soutier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marius Soutier updated SPARK-6613:
--
Description: 
When continuing my streaming job from a checkpoint, the job runs, but the 
Streaming tab in the standard UI initially no longer works (browser just shows 
HTTP ERROR: 500). After a while, it gets back to normal, at least most of the 
time (sometimes it doesn't work at all, but that's rare).

Stacktrace:

WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
java.util.NoSuchElementException: key not found: 0
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:151)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:150)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.Range.foreach(Range.scala:141)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:150)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:149)
at scala.Option.map(Option.scala:145)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener.lastReceivedBatchRecords(StreamingJobProgressListener.scala:149)
at 
org.apache.spark.streaming.ui.StreamingPage.generateReceiverStats(StreamingPage.scala:82)
at 
org.apache.spark.streaming.ui.StreamingPage.render(StreamingPage.scala:43)
at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:68)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at 
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at 
org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at 
org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
at 
org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)


  was:
When continuing my streaming job from a checkpoint, the job runs, but the 
Streaming tab in the standard UI no longer works (browser just shows HTTP 
ERROR: 500).

Stacktrace:

WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
java.util.NoSuchElementException: key not found: 0
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at 

[jira] [Created] (SPARK-6613) Starting stream from checkpoint causes Streaming tab to throw error

2015-03-30 Thread Marius Soutier (JIRA)
Marius Soutier created SPARK-6613:
-

 Summary: Starting stream from checkpoint causes Streaming tab to 
throw error
 Key: SPARK-6613
 URL: https://issues.apache.org/jira/browse/SPARK-6613
 Project: Spark
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Marius Soutier


When continuing my streaming job from a checkpoint, it works, but the Streaming 
tab in the standard UI no longer works (browser just shows HTTP ERROR: 500).

Stacktrace:

WARN org.eclipse.jetty.servlet.ServletHandler: /streaming/
java.util.NoSuchElementException: key not found: 0
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:151)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1$$anonfun$apply$5.apply(StreamingJobProgressListener.scala:150)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.Range.foreach(Range.scala:141)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:150)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener$$anonfun$lastReceivedBatchRecords$1.apply(StreamingJobProgressListener.scala:149)
at scala.Option.map(Option.scala:145)
at 
org.apache.spark.streaming.ui.StreamingJobProgressListener.lastReceivedBatchRecords(StreamingJobProgressListener.scala:149)
at 
org.apache.spark.streaming.ui.StreamingPage.generateReceiverStats(StreamingPage.scala:82)
at 
org.apache.spark.streaming.ui.StreamingPage.render(StreamingPage.scala:43)
at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
at org.apache.spark.ui.WebUI$$anonfun$attachPage$1.apply(WebUI.scala:68)
at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:68)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at 
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at 
org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at 
org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
at 
org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6504) Cannot read Parquet files generated from different versions at once

2015-03-26 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382073#comment-14382073
 ] 

Marius Soutier commented on SPARK-6504:
---

Not easily, but 1.3.1 is supposed to be released soon, right?

 Cannot read Parquet files generated from different versions at once
 ---

 Key: SPARK-6504
 URL: https://issues.apache.org/jira/browse/SPARK-6504
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.1
Reporter: Marius Soutier

 When trying to read Parquet files generated by Spark 1.1.1 and 1.2.1 at the 
 same time via 
 `sqlContext.parquetFile("fileFrom1.1.parquet,fileFrom1.2.parquet")` an 
 exception occurs:
 could not merge metadata: key org.apache.spark.sql.parquet.row.metadata has 
 conflicting values: 
 [{"type":"struct","fields":[{"name":"date","type":"string","nullable":true,"metadata":{}},{"name":"account","type":"string","nullable":true,"metadata":{}},{"name":"impressions","type":"long","nullable":false,"metadata":{}},{"name":"cost","type":"double","nullable":false,"metadata":{}},{"name":"clicks","type":"long","nullable":false,"metadata":{}},{"name":"conversions","type":"long","nullable":false,"metadata":{}},{"name":"orderValue","type":"double","nullable":false,"metadata":{}}]},
  StructType(List(StructField(date,StringType,true), 
 StructField(account,StringType,true), 
 StructField(impressions,LongType,false), StructField(cost,DoubleType,false), 
 StructField(clicks,LongType,false), StructField(conversions,LongType,false), 
 StructField(orderValue,DoubleType,false)))]
 The Schema is exactly equal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6504) Cannot read Parquet files generated from different versions at once

2015-03-26 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382027#comment-14382027
 ] 

Marius Soutier commented on SPARK-6504:
---

No, as far as I understand, Spark 1.3 cannot read Parquets created with 1.1.x 
at all.

 Cannot read Parquet files generated from different versions at once
 ---

 Key: SPARK-6504
 URL: https://issues.apache.org/jira/browse/SPARK-6504
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.2.1
Reporter: Marius Soutier

 When trying to read Parquet files generated by Spark 1.1.1 and 1.2.1 at the 
 same time via 
 `sqlContext.parquetFile("fileFrom1.1.parquet,fileFrom1.2.parquet")` an 
 exception occurs:
 could not merge metadata: key org.apache.spark.sql.parquet.row.metadata has 
 conflicting values: 
 [{"type":"struct","fields":[{"name":"date","type":"string","nullable":true,"metadata":{}},{"name":"account","type":"string","nullable":true,"metadata":{}},{"name":"impressions","type":"long","nullable":false,"metadata":{}},{"name":"cost","type":"double","nullable":false,"metadata":{}},{"name":"clicks","type":"long","nullable":false,"metadata":{}},{"name":"conversions","type":"long","nullable":false,"metadata":{}},{"name":"orderValue","type":"double","nullable":false,"metadata":{}}]},
  StructType(List(StructField(date,StringType,true), 
 StructField(account,StringType,true), 
 StructField(impressions,LongType,false), StructField(cost,DoubleType,false), 
 StructField(clicks,LongType,false), StructField(conversions,LongType,false), 
 StructField(orderValue,DoubleType,false)))]
 The Schema is exactly equal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-6504) Cannot read Parquet files generated from different versions at once

2015-03-24 Thread Marius Soutier (JIRA)
Marius Soutier created SPARK-6504:
-

 Summary: Cannot read Parquet files generated from different 
versions at once
 Key: SPARK-6504
 URL: https://issues.apache.org/jira/browse/SPARK-6504
 Project: Spark
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Marius Soutier


When trying to read Parquet files generated by Spark 1.1.1 and 1.2.1 at the 
same time via 
`sqlContext.parquetFile("fileFrom1.1.parquet,fileFrom1.2.parquet")` an exception 
occurs:

could not merge metadata: key org.apache.spark.sql.parquet.row.metadata has 
conflicting values: 
[{"type":"struct","fields":[{"name":"date","type":"string","nullable":true,"metadata":{}},{"name":"account","type":"string","nullable":true,"metadata":{}},{"name":"impressions","type":"long","nullable":false,"metadata":{}},{"name":"cost","type":"double","nullable":false,"metadata":{}},{"name":"clicks","type":"long","nullable":false,"metadata":{}},{"name":"conversions","type":"long","nullable":false,"metadata":{}},{"name":"orderValue","type":"double","nullable":false,"metadata":{}}]},
 StructType(List(StructField(date,StringType,true), 
StructField(account,StringType,true), StructField(impressions,LongType,false), 
StructField(cost,DoubleType,false), StructField(clicks,LongType,false), 
StructField(conversions,LongType,false), 
StructField(orderValue,DoubleType,false)))]

The Schema is exactly equal.
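
A sketch of a possible workaround, assuming the Spark 1.2 API (file names as 
above): read the files in separate calls and union the results, since the 
schema is identical:

{code}
// read the 1.1.1 and 1.2.1 outputs separately instead of in one call
val fromOld = sqlContext.parquetFile("fileFrom1.1.parquet")
val fromNew = sqlContext.parquetFile("fileFrom1.2.parquet")

// unionAll should be safe here because the schema is exactly equal
val merged = fromOld.unionAll(fromNew)
{code}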




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-6304) Checkpointing doesn't retain driver port

2015-03-17 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364713#comment-14364713
 ] 

Marius Soutier edited comment on SPARK-6304 at 3/17/15 7:36 AM:


Got it, thanks. In my tests it was never set automatically, so this must be set 
at some later point.


was (Author: msoutier):
Got it, thanks.

 Checkpointing doesn't retain driver port
 

 Key: SPARK-6304
 URL: https://issues.apache.org/jira/browse/SPARK-6304
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier

 In a check-pointed Streaming application running on a fixed driver port, the 
 setting spark.driver.port is not loaded when recovering from a checkpoint.
 (The driver is then started on a random port.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6304) Checkpointing doesn't retain driver port

2015-03-17 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364713#comment-14364713
 ] 

Marius Soutier commented on SPARK-6304:
---

Got it, thanks.

 Checkpointing doesn't retain driver port
 

 Key: SPARK-6304
 URL: https://issues.apache.org/jira/browse/SPARK-6304
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier

 In a check-pointed Streaming application running on a fixed driver port, the 
 setting spark.driver.port is not loaded when recovering from a checkpoint.
 (The driver is then started on a random port.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6304) Checkpointing doesn't retain driver port

2015-03-16 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14363047#comment-14363047
 ] 

Marius Soutier commented on SPARK-6304:
---

Yeah but if the user doesn't set the port, why remove it? When Spark 
deserializes the checkpoint, the port shouldn't be set by default, right?


 Checkpointing doesn't retain driver port
 

 Key: SPARK-6304
 URL: https://issues.apache.org/jira/browse/SPARK-6304
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier

 In a check-pointed Streaming application running on a fixed driver port, the 
 setting spark.driver.port is not loaded when recovering from a checkpoint.
 (The driver is then started on a random port.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6304) Checkpointing doesn't retain driver port

2015-03-16 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14362928#comment-14362928
 ] 

Marius Soutier commented on SPARK-6304:
---

I'm just reporting the bug. As you said, the code explicitly removes 
spark.driver.host and spark.driver.port when recovering from a checkpoint, 
so I first would like to understand why that is.




 Checkpointing doesn't retain driver port
 

 Key: SPARK-6304
 URL: https://issues.apache.org/jira/browse/SPARK-6304
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier

 In a check-pointed Streaming application running on a fixed driver port, the 
 setting spark.driver.port is not loaded when recovering from a checkpoint.
 (The driver is then started on a random port.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6304) Checkpointing doesn't retain driver port

2015-03-16 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14362918#comment-14362918
 ] 

Marius Soutier commented on SPARK-6304:
---

Simple, I'm using `actorStream` and want to send data to it via remoting. For 
that I need to have a fixed port to send data to.

As a workaround I'm now starting a second ActorSystem, but it seems to have 
issues communicating with Spark's ActorSystem.
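
For context, a sketch of that setup, assuming the Spark 1.2 / Akka APIs (actor 
and stream names are illustrative):

{code}
import akka.actor.{Actor, Props}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.receiver.ActorHelper

// receiver actor running inside Spark; store() hands data to the stream
class StringReceiver extends Actor with ActorHelper {
  def receive = {
    case s: String => store(s)
  }
}

val ssc = new StreamingContext(new SparkConf().setAppName("actor-stream"), Seconds(10))
val stream = ssc.actorStream[String](Props[StringReceiver], "stringReceiver")

// a remote ActorSystem then sends messages to this actor via Akka remoting,
// which requires knowing the driver's host and (fixed) port up front
{code}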


 Checkpointing doesn't retain driver port
 

 Key: SPARK-6304
 URL: https://issues.apache.org/jira/browse/SPARK-6304
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier

 In a check-pointed Streaming application running on a fixed driver port, the 
 setting spark.driver.port is not loaded when recovering from a checkpoint.
 (The driver is then started on a random port.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-6304) Checkpointing doesn't retain driver port

2015-03-12 Thread Marius Soutier (JIRA)
Marius Soutier created SPARK-6304:
-

 Summary: Checkpointing doesn't retain driver port
 Key: SPARK-6304
 URL: https://issues.apache.org/jira/browse/SPARK-6304
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier


In a check-pointed Streaming application running on a fixed driver port, the 
setting spark.driver.port is not loaded when recovering from checkpoint.

(The driver is then started on a random port.)
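
For context, a minimal sketch of the configuration in question, assuming the 
Spark 1.2 API (the port value and paths are illustrative):

{code}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "/path/to/checkpoint" // placeholder
val conf = new SparkConf()
  .setAppName("streaming-app")
  .set("spark.driver.port", "7777") // fixed port, honored on the first start

val ssc = StreamingContext.getOrCreate(checkpointDir, () => {
  val newSsc = new StreamingContext(conf, Seconds(10))
  newSsc.checkpoint(checkpointDir)
  newSsc
})
// when the context is recovered from the checkpoint instead of created anew,
// spark.driver.port is not applied and the driver binds to a random port
{code}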




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-6304) Checkpointing doesn't retain driver port

2015-03-12 Thread Marius Soutier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marius Soutier updated SPARK-6304:
--
Description: 
In a check-pointed Streaming application running on a fixed driver port, the 
setting spark.driver.port is not loaded when recovering from a checkpoint.

(The driver is then started on a random port.)


  was:
In a check-pointed Streaming application running on a fixed driver port, the 
setting spark.driver.port is not loaded when recovering from checkpoint.

(The driver is then started on a random port.)



 Checkpointing doesn't retain driver port
 

 Key: SPARK-6304
 URL: https://issues.apache.org/jira/browse/SPARK-6304
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.1
Reporter: Marius Soutier

 In a check-pointed Streaming application running on a fixed driver port, the 
 setting spark.driver.port is not loaded when recovering from a checkpoint.
 (The driver is then started on a random port.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3928) Support wildcard matches on Parquet files

2014-10-23 Thread Marius Soutier (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181185#comment-14181185
 ] 

Marius Soutier commented on SPARK-3928:
---

This would be more than nice. Currently, `parquetFile()` supports 
comma-separated input, but this fails when one of those inputs is not 
available. A wildcard should solve that and make the API more consistent with 
other input methods.
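
For illustration, assuming the Spark 1.2 API (paths are hypothetical): 
textFile already accepts Hadoop glob patterns, while parquetFile only takes 
explicit, comma-separated paths:

{code}
// works: Hadoop glob patterns expand to whatever files exist
val logs = sc.textFile("/data/2014-??-??/part-*")

// no glob support: every listed path must exist, or the call fails
val reports = sqlContext.parquetFile("/data/day1.parquet,/data/day2.parquet")
{code}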


 Support wildcard matches on Parquet files
 -

 Key: SPARK-3928
 URL: https://issues.apache.org/jira/browse/SPARK-3928
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core, SQL
Reporter: Nicholas Chammas
Priority: Minor

 {{SparkContext.textFile()}} supports patterns like {{part-*}} and 
 {{2014-\?\?-\?\?}}. 
 It would be nice if {{SparkContext.parquetFile()}} did the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org