[jira] [Commented] (FLINK-8164) JobManager's archiving does not work on S3

2017-12-04 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276844#comment-16276844
 ] 

Chesnay Schepler commented on FLINK-8164:
-

https://github.com/apache/flink/blob/release-1.3/flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/MemoryArchivist.scala#L294

You would have to replace
{code}
if (!FileSystem.isFlinkSupportedScheme(archivePathUri.getScheme)) {
  // skip verification checks for non-flink supported filesystem
  // this is because the required filesystem classes may not be available to the flink client
  throw new IllegalArgumentException("No file system found with scheme " + scheme
    + ", referenced in file URI '" + archivePathUri.toString + "'.")
}
{code}
with
{code}
try {
  FileSystem.get(archivePathUri)
} catch {
  case e: Exception =>
    throw new IllegalArgumentException(s"No file system found for URI '${archivePathUri}'.")
}
{code}
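
To illustrate why the second form accepts S3 where the first does not, here is a minimal, self-contained sketch that does not use Flink at all. The names `builtInSchemes`, `runtimeSchemes` and `resolveFileSystem` are hypothetical stand-ins (the last one plays the role of `FileSystem.get`); the real lookup depends on what is on the JobManager's classpath. Unlike the snippet above, the sketch also attaches the original exception as the cause, which is an optional addition.

```scala
import java.net.URI
import scala.util.{Failure, Success, Try}

object SchemeCheckSketch {
  // Hypothetical stand-in for the hard-coded list of supported schemes.
  val builtInSchemes: Set[String] = Set("file", "hdfs", "maprfs")

  // Hypothetical stand-in for file systems that can only be discovered
  // at runtime, e.g. through Hadoop classes available on the classpath.
  val runtimeSchemes: Set[String] = Set("s3", "s3a")

  // Old style: reject any scheme that is not on the fixed list,
  // without ever trying to load a file system for it.
  def validateByAllowList(uri: URI): Try[URI] =
    if (builtInSchemes.contains(uri.getScheme)) Success(uri)
    else Failure(new IllegalArgumentException(
      s"No file system found with scheme ${uri.getScheme}, referenced in file URI '$uri'."))

  // Stand-in for FileSystem.get(uri): throws if nothing can be resolved.
  def resolveFileSystem(uri: URI): String = {
    val scheme = uri.getScheme
    if (builtInSchemes.contains(scheme) || runtimeSchemes.contains(scheme)) scheme
    else throw new java.io.IOException(s"Unknown scheme: $scheme")
  }

  // New style: attempt the lookup and translate any failure,
  // keeping the original exception as the cause.
  def validateByLookup(uri: URI): Try[URI] =
    Try(resolveFileSystem(uri)).map(_ => uri).recoverWith {
      case e: Exception =>
        Failure(new IllegalArgumentException(s"No file system found for URI '$uri'.", e))
    }

  def main(args: Array[String]): Unit = {
    val archiveUri = new URI("s3a://bucket/completed-jobs")
    println(validateByAllowList(archiveUri).isSuccess) // false
    println(validateByLookup(archiveUri).isSuccess)    // true
  }
}
```

The point of the sketch is only the control flow: the fixed-list check fails before any lookup happens, while the try/catch form fails only when the lookup itself fails.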

> JobManager's archiving does not work on S3
> --
>
> Key: FLINK-8164
> URL: https://issues.apache.org/jira/browse/FLINK-8164
> Project: Flink
>  Issue Type: Bug
>  Components: History Server, JobManager
>Affects Versions: 1.3.2
>Reporter: Cristian
>
> I'm trying to configure JobManager's archiving mechanism 
> (https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/historyserver.html)
>  to use S3 but I'm getting this:
> {code}
> 2017-11-28 19:11:09,751 WARN  org.apache.flink.runtime.jobmanager.MemoryArchivist   - Failed to create Path for Some(s3a://bucket/completed-jobs). Job will not be archived.
> java.lang.IllegalArgumentException: No file system found with scheme s3, referenced in file URI 's3://bucket/completed-jobs'.
>   at org.apache.flink.runtime.jobmanager.MemoryArchivist.validateAndNormalizeUri(MemoryArchivist.scala:297)
>   at org.apache.flink.runtime.jobmanager.MemoryArchivist.org$apache$flink$runtime$jobmanager$MemoryArchivist$$archiveJsonFiles(MemoryArchivist.scala:201)
>   at org.apache.flink.runtime.jobmanager.MemoryArchivist$$anonfun$handleMessage$1.applyOrElse(MemoryArchivist.scala:107)
>   at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
>   at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
>   at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>   at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
>   at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
>   at org.apache.flink.runtime.jobmanager.MemoryArchivist.aroundReceive(MemoryArchivist.scala:65)
>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}
> Which is very weird since I'm able to write to S3 from within the job itself. 
> I have also tried using s3a instead to no avail.
> This happens running Flink v1.3.2 on EMR.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-8164) JobManager's archiving does not work on S3

2017-12-04 Thread Cristian (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276801#comment-16276801
 ] 

Cristian commented on FLINK-8164:
-

Thanks, Chesnay!

If you have time, can you point me to where in the code this lives? I wasn't 
able to find it.



[jira] [Commented] (FLINK-8164) JobManager's archiving does not work on S3

2017-12-04 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276651#comment-16276651
 ] 

Chesnay Schepler commented on FLINK-8164:
-

This is caused by an overly restrictive check in the MemoryArchivist, which 
limits the supported file-system schemes to "file", "hdfs" and "maprfs". I 
can't think of a quick workaround, apart from writing to the local file system 
and uploading these files to S3 manually.

This check was removed for 1.4.
