[jira] [Created] (FLINK-6280) Allow logging with Java flags

2017-04-07 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6280:
-

 Summary: Allow logging with Java flags
 Key: FLINK-6280
 URL: https://issues.apache.org/jira/browse/FLINK-6280
 Project: Flink
  Issue Type: Improvement
  Components: Startup Shell Scripts
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan


Allow Flink's Java options to reference the logging prefix and to participate in log rotation. For example, this allows the following configurations to write {{.jfr}} and {{.jit}} files alongside the existing {{.log}} and {{.out}} files.

{code:language=bash|title=Configuration for Java Flight Recorder}
env.java.opts: "-XX:+UnlockCommercialFeatures -XX:+UnlockDiagnosticVMOptions -XX:+FlightRecorder -XX:+DebugNonSafepoints -XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,dumponexitpath=${LOG_PREFIX}.jfr"
{code}

{code:language=bash|title=Configuration for JitWatch}
env.java.opts: "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:LogFile=${LOG_PREFIX}.jit -XX:+PrintAssembly"
{code}
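The substitution the startup scripts would need to perform on {{env.java.opts}} can be sketched in plain Java; {{OptsExpander}} and its method are hypothetical names for illustration, not Flink code:

```java
import java.util.Map;

// Hypothetical sketch: expand ${VAR} placeholders in env.java.opts the way the
// startup shell scripts would, so ${LOG_PREFIX} resolves per TaskManager.
public class OptsExpander {
    public static String expand(String opts, Map<String, String> vars) {
        String result = opts;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String opts = "-XX:LogFile=${LOG_PREFIX}.jit";
        // each TaskManager would supply its own log prefix
        System.out.println(expand(opts, Map.of("LOG_PREFIX", "/var/log/flink/tm-1")));
    }
}
```

With this in place, rotated dump files land next to the existing per-process {{.log}} and {{.out}} files.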



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6279) the digest of VolcanoRuleMatch matched different table sources with same field names may be same

2017-04-07 Thread godfrey he (JIRA)
godfrey he created FLINK-6279:
-

 Summary: the digest of VolcanoRuleMatch matched different table 
sources with same field names may be same
 Key: FLINK-6279
 URL: https://issues.apache.org/jira/browse/FLINK-6279
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Reporter: godfrey he
Assignee: godfrey he


{{toString}} of {{TableSourceScan}} should contain the table name and the result of {{explainSource}} to distinguish table sources that have the same field names. Otherwise, the digests of {{VolcanoRuleMatch}} instances that match those table sources may be identical.
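The collision can be illustrated in plain Java; the digest strings and helper names below are hypothetical, not the actual Calcite/Flink digest code:

```java
import java.util.Arrays;

// Hypothetical sketch of how a scan digest can collide: if only the field
// names go into the digest, two different tables with the same schema are
// indistinguishable; including the table name disambiguates them.
public class DigestSketch {
    static String digestWithoutName(String[] fields) {
        return "TableSourceScan(fields: " + Arrays.toString(fields) + ")";
    }

    static String digestWithName(String table, String[] fields) {
        return "TableSourceScan(table: " + table + ", fields: " + Arrays.toString(fields) + ")";
    }

    public static void main(String[] args) {
        String[] schema = {"id", "name"};
        // same digest for two different tables -> the planner may treat them as one
        System.out.println(digestWithoutName(schema));
        System.out.println(digestWithName("orders", schema));
        System.out.println(digestWithName("users", schema));
    }
}
```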





[jira] [Created] (FLINK-6278) After "FATAL" message in logs taskmanager still working

2017-04-07 Thread Andrey (JIRA)
Andrey created FLINK-6278:
-

 Summary: After "FATAL" message in logs taskmanager still working
 Key: FLINK-6278
 URL: https://issues.apache.org/jira/browse/FLINK-6278
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Andrey


Steps to reproduce:
* create a job that causes an OOM, for example by creating a byte array for each incoming message and keeping it in memory:
{code}
// ... preceding operator ...
.flatMap(new RichFlatMapFunction<String, String>() {

    private final List<byte[]> oomList = new ArrayList<>();

    @Override
    public void flatMap(String value, Collector<String> out) throws Exception {
        if (oomHost != null && oomHost.equals(host)) {
            // multiply speed towards OOM
            oomList.add(new byte[100 * 1024]);
        }
    }
})
{code}
* after some time the task manager hangs
* according to the logs, the task manager is disconnected from ZooKeeper (HA mode was configured) and from the job manager
* then this big log entry appears:
{code}
2017-04-07 09:13:32,893 ERROR org.apache.flink.runtime.taskmanager.TaskManager  
- 
==
==  FATAL  ===
==

A fatal error occurred, forcing the TaskManager to shut down: Task 'Flat Map -> 
Sink: Unnamed (2/2)' did not react to cancelling signal in the last 30 seconds, 
but is stuck in method:
 
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:182)
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:63)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:272)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
java.lang.Thread.run(Thread.java:745)
{code}
* the TM is still running.

Thread dump attached.

Expected:
* the task manager shuts down
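The expected escalation could be sketched as a watchdog decision; {{CancellationWatchdog}} and its method are hypothetical names, not the actual TaskManager code:

```java
// Hypothetical watchdog sketch: decide whether the TaskManager process must be
// forcibly terminated because a task ignored the cancellation signal too long.
public class CancellationWatchdog {
    static boolean shouldForceShutdown(long cancelRequestedAtMillis,
                                       long nowMillis,
                                       long timeoutMillis,
                                       boolean taskStillRunning) {
        return taskStillRunning && (nowMillis - cancelRequestedAtMillis) >= timeoutMillis;
    }

    public static void main(String[] args) {
        long timeout = 30_000L; // the 30 seconds from the FATAL log message
        System.out.println(shouldForceShutdown(0L, 31_000L, timeout, true));  // deadline passed: halt
        System.out.println(shouldForceShutdown(0L, 10_000L, timeout, true));  // still within deadline
        // On a positive decision the process would call Runtime.getRuntime().halt(-1)
        // rather than System.exit, since shutdown hooks may themselves be stuck.
    }
}
```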





[jira] [Created] (FLINK-6277) RuntimeException in logs during normal shutdown

2017-04-07 Thread Andrey (JIRA)
Andrey created FLINK-6277:
-

 Summary: RuntimeException in logs during normal shutdown
 Key: FLINK-6277
 URL: https://issues.apache.org/jira/browse/FLINK-6277
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.2.0
Reporter: Andrey


During job shutdown, I noticed an exception in the logs:
{code}
2017-04-07 08:17:51,995 WARN  
org.apache.flink.streaming.api.operators.AbstractStreamOperator  - Error while 
emitting latency marker.
java.lang.RuntimeException
at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:99)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:791)
at 
org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.onProcessingTime(StreamSource.java:142)
at 
org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$RepeatedTriggerTask.run(SystemProcessingTimeService.java:256)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at 
org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:168)
at 
org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:138)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:131)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:106)
at 
org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordWriter.java:104)
at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:96)
... 10 more
{code}

It looks like {{RecordWriterOutput}} catches and re-throws all exceptions, even {{InterruptedException}}. According to the {{cancel}} method, {{LatencyMarksEmitter}} may interrupt the future.

Expected:
* shut down {{LatencyMarksEmitter}} cleanly
* handle {{InterruptedException}} correctly
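The expected handling can be sketched in plain Java; the {{BlockingQueue}} stands in for the buffer pool and all names here are hypothetical, not Flink's actual emitter code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of the expected handling: when the emitting thread is
// interrupted during shutdown, restore the interrupt flag and drop the marker
// instead of wrapping InterruptedException in a RuntimeException.
public class LatencyMarkerSketch {
    static boolean tryEmit(BlockingQueue<Long> bufferPool, long marker) {
        try {
            bufferPool.put(marker); // may block waiting for a network buffer
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;                       // shutting down: no WARN, no rethrow
        }
    }

    public static void main(String[] args) {
        BlockingQueue<Long> full = new ArrayBlockingQueue<>(1);
        full.offer(1L);                        // pool exhausted, put(...) would block
        Thread.currentThread().interrupt();    // simulate shutdown interrupting the emitter
        System.out.println(tryEmit(full, 2L)); // false, and the interrupt flag stays set
        System.out.println(Thread.interrupted());
    }
}
```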





[jira] [Created] (FLINK-6276) InvalidTypesException: Unknown Error. Type is null.

2017-04-07 Thread Luke Hutchison (JIRA)
Luke Hutchison created FLINK-6276:
-

 Summary: InvalidTypesException: Unknown Error. Type is null.
 Key: FLINK-6276
 URL: https://issues.apache.org/jira/browse/FLINK-6276
 Project: Flink
  Issue Type: Bug
  Components: Core, DataSet API
Affects Versions: 1.2.0
Reporter: Luke Hutchison


Quite frequently when writing Flink code, I get the exception 
{{InvalidTypesException: Unknown Error. Type is null.}} 

A small example that triggers it is:

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class TestMain {

    @SafeVarargs
    public static <K, V> DataSet<Tuple2<K, List<V>>> join(V missingValuePlaceholder,
            DataSet<Tuple2<K, V>>... datasets) {
        DataSet<Tuple2<K, List<V>>> join = null;
        for (int i = 0; i < datasets.length; i++) {
            final int datasetIdx = i;
            if (datasetIdx == 0) {
                join = datasets[datasetIdx] //
                        .map(t -> new Tuple2<>(t.f0, Arrays.asList(t.f1))) //
                        .name("start join");
            } else {
                join = join.coGroup(datasets[datasetIdx]) //
                        .where(0).equalTo(0) //
                        .with((Iterable<Tuple2<K, List<V>>> li, Iterable<Tuple2<K, V>> ri,
                                Collector<Tuple2<K, List<V>>> out) -> {
                            K key = null;
                            List<V> vals = new ArrayList<>(datasetIdx + 1);
                            Iterator<Tuple2<K, List<V>>> lIter = li.iterator();
                            if (!lIter.hasNext()) {
                                for (int j = 0; j < datasetIdx; j++) {
                                    vals.add(missingValuePlaceholder);
                                }
                            } else {
                                Tuple2<K, List<V>> lt = lIter.next();
                                key = lt.f0;
                                vals.addAll(lt.f1);
                                if (lIter.hasNext()) {
                                    throw new RuntimeException("Got non-unique key: " + key);
                                }
                            }
                            Iterator<Tuple2<K, V>> rIter = ri.iterator();
                            if (!rIter.hasNext()) {
                                vals.add(missingValuePlaceholder);
                            } else {
                                Tuple2<K, V> rt = rIter.next();
                                key = rt.f0;
                                vals.add(rt.f1);
                                if (rIter.hasNext()) {
                                    throw new RuntimeException("Got non-unique key: " + key);
                                }
                            }
                            out.collect(new Tuple2<>(key, vals));
                        }) //
                        .name("join #" + datasetIdx);
            }
        }
        return join;
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<String, Integer>> x = //
                env.fromElements(new Tuple2<>("a", 3), new Tuple2<>("b", 4), new Tuple2<>("c", 5));
        DataSet<Tuple2<String, Integer>> y = //
                env.fromElements(new Tuple2<>("b", 0), new Tuple2<>("c", 1), new Tuple2<>("d", 2));
        DataSet<Tuple2<String, Integer>> z = //
                env.fromElements(new Tuple2<>("c", 7), new Tuple2<>("d", 8), new Tuple2<>("e", 9));

        System.out.println(join(-1, x, y, z).collect());
    }
}
{code}
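The underlying reflection behaviour can be shown without Flink at all; this is not Flink's {{TypeExtractor}}, just a plain-Java illustration of why a lambda yields no type information while an anonymous class does:

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.function.Function;

// Illustration of the cause: a framework reflecting on a function's generic
// interfaces sees a ParameterizedType for an anonymous class but only the raw
// interface for a lambda, so the extracted type ends up null.
public class ErasureDemo {
    static Type firstTypeArgument(Function<String, Integer> f) {
        for (Type iface : f.getClass().getGenericInterfaces()) {
            if (iface instanceof ParameterizedType) {
                return ((ParameterizedType) iface).getActualTypeArguments()[0];
            }
        }
        return null; // "Type is null": nothing recoverable from a lambda
    }

    public static void main(String[] args) {
        Function<String, Integer> lambda = s -> s.length();
        Function<String, Integer> anon = new Function<String, Integer>() {
            @Override public Integer apply(String s) { return s.length(); }
        };
        System.out.println(firstTypeArgument(lambda)); // null
        System.out.println(firstTypeArgument(anon));   // class java.lang.String
    }
}
```

This is why the exception message suggests {{returns(...)}} or {{ResultTypeQueryable}}: both supply the type information the lambda cannot.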

The stacktrace that is triggered is:

{noformat}
Exception in thread "main" 
org.apache.flink.api.common.functions.InvalidTypesException: The return type of 
function 'join(TestMain.java:23)' could not be determined automatically, due to 
type erasure. You can give type information hints by using the returns(...) 
method on the result of the transformation call, or by letting your function 
implement the 'ResultTypeQueryable' interface.
at org.apache.flink.api.java.DataSet.getType(DataSet.java:174)
at 
org.apache.flink.api.java.operators.CoGroupOperator$CoGroupOperatorSets.where(CoGroupOperator.java:424)
at 
com.rentlogic.buildingscores.flink.experimental.TestMain.join(TestMain.java:27)
at 
com.rentlogic.buildingscores.flink.experimental.TestMain.main(TestMain.java:74)
Caused by: org.apache.flink.api.common.functions.InvalidTypesException: Input 
mismatch: Unknown Error. Type is null.
at