[jira] [Created] (FLINK-6280) Allow logging with Java flags
Greg Hogan created FLINK-6280:

Summary: Allow logging with Java flags
Key: FLINK-6280
URL: https://issues.apache.org/jira/browse/FLINK-6280
Project: Flink
Issue Type: Improvement
Components: Startup Shell Scripts
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan

Allow configuring Flink's Java options with the logging prefix and log rotation. For example, this allows the following configurations to write {{.jfr}} and {{.jit}} files alongside the existing {{.log}} and {{.out}} files.

{code:language=bash|title=Configuration for Java Flight Recorder}
env.java.opts: "-XX:+UnlockCommercialFeatures -XX:+UnlockDiagnosticVMOptions -XX:+FlightRecorder -XX:+DebugNonSafepoints -XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,dumponexitpath=${LOG_PREFIX}.jfr"
{code}

{code:language=bash|title=Configuration for JitWatch}
env.java.opts: "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:LogFile=${LOG_PREFIX}.jit -XX:+PrintAssembly"
{code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Created] (FLINK-6279) the digest of VolcanoRuleMatch may be identical for different table sources with the same field names
godfrey he created FLINK-6279:

Summary: the digest of VolcanoRuleMatch may be identical for different table sources with the same field names
Key: FLINK-6279
URL: https://issues.apache.org/jira/browse/FLINK-6279
Project: Flink
Issue Type: Bug
Components: Table API & SQL
Reporter: godfrey he
Assignee: godfrey he

{{toString}} of {{TableSourceScan}} should contain the table name and the result of {{explainSource}} in order to distinguish table sources that have the same field names. Otherwise, the digests of the {{VolcanoRuleMatch}} instances that match those table sources may be identical.
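The problem and the proposed fix can be illustrated with a small, self-contained sketch. The classes below ({{FakeTableSource}}, {{FakeTableSourceScan}}) are hypothetical stand-ins, not Flink's actual API: a digest built from field names alone collides for two different sources with the same schema, while a digest that also includes the table name and {{explainSource}} output keeps them apart.

```java
import java.util.List;

// Hypothetical stand-in for a table source; Flink's real TableSource is more involved.
class FakeTableSource {
    final String name;
    final List<String> fieldNames;

    FakeTableSource(String name, List<String> fieldNames) {
        this.name = name;
        this.fieldNames = fieldNames;
    }

    // Mirrors the role of TableSource#explainSource: a string describing the concrete source.
    String explainSource() {
        return "source=" + name;
    }
}

// Hypothetical scan node illustrating the digest problem described in the issue.
class FakeTableSourceScan {
    final FakeTableSource source;

    FakeTableSourceScan(FakeTableSource source) {
        this.source = source;
    }

    // Before the fix: the digest is built from field names only, so two different
    // sources with identical schemas produce the same digest.
    String digestBefore() {
        return "TableSourceScan(fields=" + source.fieldNames + ")";
    }

    // After the fix: the digest also contains the table name and explainSource(),
    // so the two scans can be told apart.
    String digestAfter() {
        return "TableSourceScan(table=" + source.name
                + ", fields=" + source.fieldNames
                + ", " + source.explainSource() + ")";
    }
}
```

With two different sources sharing the schema {{(id, amount)}}, {{digestBefore}} is identical for both scans while {{digestAfter}} is not.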
[jira] [Created] (FLINK-6278) TaskManager keeps working after "FATAL" message in logs
Andrey created FLINK-6278:

Summary: TaskManager keeps working after "FATAL" message in logs
Key: FLINK-6278
URL: https://issues.apache.org/jira/browse/FLINK-6278
Project: Flink
Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Andrey

Steps to reproduce:
* Create a job that causes an OOM, for example by allocating a byte array for each incoming message and keeping it in memory:
{code}
}).flatMap(new RichFlatMapFunction<String, String>() {
    private List<byte[]> oomList = new ArrayList<>();

    @Override
    public void flatMap(String value, Collector<String> out) throws Exception {
        if (oomHost != null && oomHost.equals(host)) {
            // multiply speed towards OOM
            oomList.add(new byte[100 * 1024]);
        }
    }
})
{code}
* After some time the task manager hangs.
* According to the logs, the task manager is disconnected from ZooKeeper (HA mode was configured) and from the job manager.
* Then a big log entry appears:
{code}
2017-04-07 09:13:32,893 ERROR org.apache.flink.runtime.taskmanager.TaskManager -
==============================
======      FATAL      ======
==============================
A fatal error occurred, forcing the TaskManager to shut down: Task 'Flat Map -> Sink: Unnamed (2/2)' did not react to cancelling signal in the last 30 seconds, but is stuck in method:
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:182)
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:63)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:272)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
java.lang.Thread.run(Thread.java:745)
{code}
* The TM is still running. Thread dump attached.

Expected:
* The task manager shuts down.
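The expected behaviour (actually terminating the process once a fatal error is reported, instead of only logging "FATAL") can be sketched outside of Flink with a small handler. This is an illustrative pattern only, not Flink's shutdown code: the {{terminator}} action is injected here so the sketch is testable; in a real process it would call {{Runtime.getRuntime().halt(...)}} so that a task thread stuck in user code cannot block the exit.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative fatal-error handler: logs once and then forces termination,
// instead of merely printing a FATAL banner and leaving the process running.
class FatalErrorHandler {
    private final AtomicBoolean terminated = new AtomicBoolean(false);
    private final Runnable terminator; // injected for testability; a real
                                       // handler would halt the JVM here

    FatalErrorHandler(Runnable terminator) {
        this.terminator = terminator;
    }

    void onFatalError(String message, Throwable cause) {
        System.err.println("FATAL: " + message);
        if (cause != null) {
            cause.printStackTrace();
        }
        // Force termination exactly once, even if several threads report errors.
        if (terminated.compareAndSet(false, true)) {
            terminator.run();
        }
    }

    boolean isTerminated() {
        return terminated.get();
    }
}
```

The {{compareAndSet}} guard matters because in the scenario above several components (network stack, task threads, heartbeats) may detect the failure concurrently.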
[jira] [Created] (FLINK-6277) RuntimeException in logs during normal shutdown
Andrey created FLINK-6277:

Summary: RuntimeException in logs during normal shutdown
Key: FLINK-6277
URL: https://issues.apache.org/jira/browse/FLINK-6277
Project: Flink
Issue Type: Bug
Components: Streaming
Affects Versions: 1.2.0
Reporter: Andrey

During job shutdown, I noticed an exception in the logs:
{code}
2017-04-07 08:17:51,995 WARN  org.apache.flink.streaming.api.operators.AbstractStreamOperator - Error while emitting latency marker.
java.lang.RuntimeException
	at org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:99)
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:791)
	at org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.onProcessingTime(StreamSource.java:142)
	at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$RepeatedTriggerTask.run(SystemProcessingTimeService.java:256)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
	at java.lang.Object.wait(Native Method)
	at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:168)
	at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:138)
	at org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:131)
	at org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:106)
	at org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordWriter.java:104)
	at org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:96)
	... 10 more
{code}

It looks like {{RecordWriterOutput}} catches and re-throws all exceptions, even {{InterruptedException}}. Judging by the {{cancel}} method, {{LatencyMarksEmitter}} may interrupt the future.

Expected:
* {{LatencyMarksEmitter}} shuts down cleanly.
* {{InterruptedException}} is handled correctly.
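The expected handling can be sketched as follows. This is the general Java interruption pattern, not the actual Flink fix, and {{MarkerEmitter}} is an illustrative stand-in for the latency-marker emitter: when the thread is interrupted during shutdown, the emitter restores the interrupt flag and stops quietly rather than wrapping the {{InterruptedException}} in a {{RuntimeException}} that then gets logged as an error.

```java
// Illustrative emitter loop showing interruption-friendly shutdown.
class MarkerEmitter implements Runnable {
    volatile boolean sawSpuriousError = false;

    @Override
    public void run() {
        try {
            while (true) {
                emitOne(); // may block, like requesting a network buffer
            }
        } catch (InterruptedException e) {
            // Shutdown path: restore the flag for callers further up the stack
            // and exit quietly instead of re-throwing as a RuntimeException.
            Thread.currentThread().interrupt();
        } catch (RuntimeException e) {
            // With the buggy pattern from the stack trace, the interrupt would
            // surface here as a wrapped RuntimeException and be logged as an error.
            sawSpuriousError = true;
        }
    }

    private void emitOne() throws InterruptedException {
        Thread.sleep(10); // stands in for a blocking buffer request
    }
}
```

If the interrupt flag is already set when {{emitOne}} blocks, {{Thread.sleep}} throws {{InterruptedException}} immediately and the loop exits through the clean path, leaving the flag restored for the caller.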
[jira] [Created] (FLINK-6276) InvalidTypesException: Unknown Error. Type is null.
Luke Hutchison created FLINK-6276:

Summary: InvalidTypesException: Unknown Error. Type is null.
Key: FLINK-6276
URL: https://issues.apache.org/jira/browse/FLINK-6276
Project: Flink
Issue Type: Bug
Components: Core, DataSet API
Affects Versions: 1.2.0
Reporter: Luke Hutchison

Quite frequently when writing Flink code, I get the exception {{InvalidTypesException: Unknown Error. Type is null.}} A small example that triggers it is:

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class TestMain {
    @SafeVarargs
    public static <K, V> DataSet<Tuple2<K, List<V>>> join(V missingValuePlaceholder, DataSet<Tuple2<K, V>>... datasets) {
        DataSet<Tuple2<K, List<V>>> join = null;
        for (int i = 0; i < datasets.length; i++) {
            final int datasetIdx = i;
            if (datasetIdx == 0) {
                join = datasets[datasetIdx] //
                        .map(t -> new Tuple2<>(t.f0, Arrays.asList(t.f1))) //
                        .name("start join");
            } else {
                join = join.coGroup(datasets[datasetIdx]) //
                        .where(0).equalTo(0) //
                        .with((Iterable<Tuple2<K, List<V>>> li, Iterable<Tuple2<K, V>> ri, Collector<Tuple2<K, List<V>>> out) -> {
                            K key = null;
                            List<V> vals = new ArrayList<>(datasetIdx + 1);
                            Iterator<Tuple2<K, List<V>>> lIter = li.iterator();
                            if (!lIter.hasNext()) {
                                for (int j = 0; j < datasetIdx; j++) {
                                    vals.add(missingValuePlaceholder);
                                }
                            } else {
                                Tuple2<K, List<V>> lt = lIter.next();
                                key = lt.f0;
                                vals.addAll(lt.f1);
                                if (lIter.hasNext()) {
                                    throw new RuntimeException("Got non-unique key: " + key);
                                }
                            }
                            Iterator<Tuple2<K, V>> rIter = ri.iterator();
                            if (!rIter.hasNext()) {
                                vals.add(missingValuePlaceholder);
                            } else {
                                Tuple2<K, V> rt = rIter.next();
                                key = rt.f0;
                                vals.add(rt.f1);
                                if (rIter.hasNext()) {
                                    throw new RuntimeException("Got non-unique key: " + key);
                                }
                            }
                            out.collect(new Tuple2<>(key, vals));
                        }) //
                        .name("join #" + datasetIdx);
            }
        }
        return join;
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<String, Integer>> x = //
                env.fromElements(new Tuple2<>("a", 3), new Tuple2<>("b", 4), new Tuple2<>("c", 5));
        DataSet<Tuple2<String, Integer>> y = //
                env.fromElements(new Tuple2<>("b", 0), new Tuple2<>("c", 1), new Tuple2<>("d", 2));
        DataSet<Tuple2<String, Integer>> z = //
                env.fromElements(new Tuple2<>("c", 7), new Tuple2<>("d", 8), new Tuple2<>("e", 9));
        System.out.println(join(-1, x, y, z).collect());
    }
}
{code}

The stacktrace that is triggered is:

{noformat}
Exception in thread "main" org.apache.flink.api.common.functions.InvalidTypesException: The return type of function 'join(TestMain.java:23)' could not be determined automatically, due to type erasure. You can give type information hints by using the returns(...) method on the result of the transformation call, or by letting your function implement the 'ResultTypeQueryable' interface.
	at org.apache.flink.api.java.DataSet.getType(DataSet.java:174)
	at org.apache.flink.api.java.operators.CoGroupOperator$CoGroupOperatorSets.where(CoGroupOperator.java:424)
	at com.rentlogic.buildingscores.flink.experimental.TestMain.join(TestMain.java:27)
	at com.rentlogic.buildingscores.flink.experimental.TestMain.main(TestMain.java:74)
Caused by: org.apache.flink.api.common.functions.InvalidTypesException: Input mismatch: Unknown Error. Type is null.
	at
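The root cause named in the message is Java type erasure: at runtime neither a generic container nor a lambda carries any record of its type parameters, so Flink cannot recover the {{coGroup}} function's output type by reflection and asks for a {{returns(...)}} hint or a {{ResultTypeQueryable}} implementation. This can be demonstrated in plain Java, with no Flink involved ({{ErasureDemo}} is purely illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class ErasureDemo {
    // At runtime both lists have the exact same class: type arguments are erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    // A generic lambda's implementation method likewise exposes only Object,
    // which is why a framework cannot recover the type parameters by reflection alone.
    static Class<?> lambdaReturnType() {
        Function<String, Integer> f = String::length;
        try {
            return f.getClass().getMethod("apply", Object.class).getReturnType();
        } catch (NoSuchMethodException e) {
            throw new AssertionError(e);
        }
    }
}
```

In the reporter's code, the workaround the exception message itself suggests would be a type hint, e.g. a {{.returns(...)}} call on the result of the {{with(...)}} transformation.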