[jira] [Created] (FLINK-8182) Unable to read hdfs file system directory(which contains sub directories) recursively

2017-12-01 Thread Shashank Agarwal (JIRA)
Shashank Agarwal created FLINK-8182:
---

 Summary: Unable to read hdfs file system directory(which contains 
sub directories) recursively  
 Key: FLINK-8182
 URL: https://issues.apache.org/jira/browse/FLINK-8182
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Affects Versions: 1.3.2
Reporter: Shashank Agarwal


Unable to read an HDFS file system directory (which contains subdirectories) 
recursively. It works fine when a single directory contains only files, but when 
the directory contains subdirectories it doesn't read the subdirectory files.


{code}
streamExecutionEnvironment.readTextFile("HDFS path")
{code}
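For illustration, the recursive enumeration the reporter expects can be sketched with plain {{java.nio}} (this is not the Flink API; as a possible workaround, Flink's {{FileInputFormat}} documents a {{recursive.file.enumeration}} configuration parameter usable via {{readFile(...).withParameters(...)}}, though whether it applies to this setup is an assumption to verify):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class RecursiveEnumeration {
    // Collect all regular files under `root`, descending into subdirectories --
    // the behavior the reporter expects from readTextFile on a nested path.
    public static List<Path> listFilesRecursively(Path root) throws IOException {
        List<Path> files = new ArrayList<>();
        try (Stream<Path> walk = Files.walk(root)) {
            walk.filter(Files::isRegularFile).forEach(files::add);
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("demo");
        Files.createDirectories(root.resolve("sub"));
        Files.createFile(root.resolve("a.txt"));
        Files.createFile(root.resolve("sub").resolve("b.txt"));
        // Both the top-level file and the nested file are found.
        System.out.println(listFilesRecursively(root).size()); // prints 2
    }
}
```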




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-8183) Add native Avro type support to the Table API & SQL

2017-12-01 Thread Timo Walther (JIRA)
Timo Walther created FLINK-8183:
---

 Summary: Add native Avro type support to the Table API & SQL
 Key: FLINK-8183
 URL: https://issues.apache.org/jira/browse/FLINK-8183
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Reporter: Timo Walther


Avro types can pass through the Table API; however, there should be more native 
support in order to provide the best user experience. This issue is an umbrella 
issue for tasks that would improve the handling of Avro types:

Improvements could be:

- Create an Avro type information, derived from an Avro schema, that maps 
to all supported Table API types (full knowledge about the keys and values of lists, 
maps, and union types instead of {{GenericType}})


- Convert {{Utf8}} (even in nested Avro types) to string when entering the 
Table API and convert it back if necessary

- Add scalar functions to change certain values (e.g., 
{{select('avroRecord.set("name", "Bob").set("age", 12))}}). This is 
particularly useful when a type has a lot of fields.
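A conversion as in the second bullet could conceptually work like the following sketch, which descends through nested maps and lists the way a converter would descend through nested Avro records, arrays, and maps (hypothetical helper, not Avro code; Avro's {{Utf8}} is a {{CharSequence}}, stood in for here by {{StringBuilder}} to stay self-contained):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Utf8Normalizer {
    // Recursively replace any CharSequence (e.g. Avro's Utf8) with java.lang.String,
    // descending into maps and lists; primitives pass through unchanged.
    public static Object normalize(Object value) {
        if (value instanceof CharSequence) {
            return value.toString();
        }
        if (value instanceof Map) {
            Map<Object, Object> out = new LinkedHashMap<>();
            ((Map<?, ?>) value).forEach((k, v) -> out.put(normalize(k), normalize(v)));
            return out;
        }
        if (value instanceof List) {
            return ((List<?>) value).stream()
                    .map(Utf8Normalizer::normalize)
                    .collect(Collectors.toList());
        }
        return value;
    }

    public static void main(String[] args) {
        // A StringBuilder stands in for a nested Utf8 value inside a record-like map.
        Object result = normalize(Map.of("name", new StringBuilder("Bob")));
        System.out.println(((Map<?, ?>) result).get("name") instanceof String); // prints true
    }
}
```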



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-8184) Return raw JsonPlan instead of escaped string value in JobDetailsInfo

2017-12-01 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8184:


 Summary: Return raw JsonPlan instead of escaped string value in 
JobDetailsInfo
 Key: FLINK-8184
 URL: https://issues.apache.org/jira/browse/FLINK-8184
 Project: Flink
  Issue Type: Bug
  Components: REST
Affects Versions: 1.5.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Priority: Minor
 Fix For: 1.5.0


The {{JobDetailsInfo}} should pass the JsonPlan as a raw value because 
otherwise the string value will be escaped.
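The escaping problem can be demonstrated without any serialization framework (illustrative sketch; the helper below is hypothetical and is neither Flink nor Jackson code):

```java
public class RawVsEscaped {
    // Minimal escaping of a string for embedding as an ordinary JSON string value.
    static String asEscapedStringField(String json) {
        return "\"" + json.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String jsonPlan = "{\"nodes\":[]}";
        // Passed through raw, the plan stays a JSON object:
        System.out.println("{\"plan\":" + jsonPlan + "}");
        // prints {"plan":{"nodes":[]}}
        // Treated as a plain string value, every quote gets escaped:
        System.out.println("{\"plan\":" + asEscapedStringField(jsonPlan) + "}");
        // prints {"plan":"{\"nodes\":[]}"}
    }
}
```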



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-8185) Make first level flattening optional for input and output

2017-12-01 Thread Timo Walther (JIRA)
Timo Walther created FLINK-8185:
---

 Summary: Make first level flattening optional for input and output
 Key: FLINK-8185
 URL: https://issues.apache.org/jira/browse/FLINK-8185
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Reporter: Timo Walther


When converting a {{DataSet}}/{{DataStream}} into a table, a composite type is 
automatically flattened into a row of its fields; e.g., {{MyPojo}} becomes a 
{{Row}} with one field per POJO attribute. There is no possibility to keep the 
composite type nested as a single field of the {{Row}}. This would be especially 
interesting for POJOs that should not be flattened and should be converted back 
to the same type: {{toTable[MyPojo]}}. 

At the moment a user has to specify all fields in order to get the same 
behavior.
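Conceptually, the difference between the current first-level flattening and the requested optional behavior can be sketched as follows (illustrative stand-in code, not the Table API):

```java
import java.util.Map;

public class FlatteningDemo {
    static final class MyPojo {
        final int a;
        final String b;
        MyPojo(int a, String b) { this.a = a; this.b = b; }
    }

    // Current behavior: first-level flattening yields one column per field.
    static Map<String, Object> flatten(MyPojo p) {
        return Map.of("a", p.a, "b", p.b);
    }

    // Requested optional behavior: keep the composite as a single column,
    // so it can be converted back to the same type later.
    static Map<String, Object> keepNested(MyPojo p) {
        return Map.of("mypojo", p);
    }

    public static void main(String[] args) {
        MyPojo p = new MyPojo(1, "x");
        System.out.println(flatten(p).size());    // prints 2
        System.out.println(keepNested(p).size()); // prints 1
    }
}
```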



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Flink 1.4.0-RC2 Hadoop Build

2017-12-01 Thread Vijay Srinivasaraghavan
Hello,
I am trying to build and run Flink from 1.4.0-rc2 branch with hadoop binary 
2.7.0 compatibility.
Here are the steps I followed to build (I have Maven 3.3.9):

cd $FLINK_HOME
mvn clean install -DskipTests -Dhadoop.version=2.7.0
cd $FLINK_HOME/flink-dist
mvn clean install -Dhadoop.version=2.7.0

I am then running Flink from $FLINK_HOME/flink-dist/target/flink-1.4.0-bin.

I am seeing the error messages below in the logs, and they suggest that a Hadoop 
dependency is not available. Could someone please confirm the build steps?
org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:403)
    at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:318)
    at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
    at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory.<init>(FsCheckpointStreamFactory.java:99)
    at org.apache.flink.runtime.state.filesystem.FsStateBackend.createStreamFactory(FsStateBackend.java:277)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.createCheckpointStreamFactory(StreamTask.java:787)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:247)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Cannot support file system for 'hdfs' via Hadoop, because Hadoop is not in the classpath, or some classes are missing from the classpath.
    at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:179)
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:399)
    ... 11 more
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hdfs.DFSConfigKeys
    at org.apache.hadoop.hdfs.DFSClient$Conf.<init>(DFSClient.java:509)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:637)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
    at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:159)
    ... 12 more
2017-12-01 04:28:09,274 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Collection Source -> Map (1/1) (74e8a3f21c86e8dec3c55988fff42e5d) switched from RUNNING to FAILED.
org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:403)
    at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:318)
    at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
    at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory.<init>(FsCheckpointStreamFactory.java:99)
    at org.apache.flink.runtime.state.filesystem.FsStateBackend.createStreamFactory(FsStateBackend.java:277)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.createCheckpointStreamFactory(StreamTask.java:787)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:247)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Cannot support file system for 'hdfs' via Hadoop, because Hadoop is not in the classpath, or some classes are missing from the classpath.
    at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:179)
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:399)
    ... 11 more
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hdfs.DFSConfigKeys
    at org.apache.hadoop.hdfs.DFSClient$Conf.<init>(DFSClient.java:509)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:637)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:

[jira] [Created] (FLINK-8186) AvroInputFormat regression: fails to deserialize GenericRecords on cluster with hadoop27 compat

2017-12-01 Thread Sebastian Klemke (JIRA)
Sebastian Klemke created FLINK-8186:
---

 Summary: AvroInputFormat regression: fails to deserialize 
GenericRecords on cluster with hadoop27 compat
 Key: FLINK-8186
 URL: https://issues.apache.org/jira/browse/FLINK-8186
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.4.0
Reporter: Sebastian Klemke


The following job runs fine on a Flink 1.3.2 cluster, but fails on a Flink 
1.4.0 RC2 standalone cluster, "hadoop27" flavour:
{code}
public class GenericRecordCount {
    public static void main(String[] args) throws Exception {
        String input = ParameterTool.fromArgs(args).getRequired("input");

        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        long count = env.readFile(new AvroInputFormat<>(new Path(input), GenericRecord.class), input)
                .count();

        System.out.printf("Counted %d records\n", count);
    }
}
{code}
Runs fine in LocalExecutionEnvironment and also on no-hadoop flavour standalone 
cluster, though. Exception thrown in Flink 1.4.0 hadoop27:
{code}
12/01/2017 13:22:09 DataSource (at readFile(ExecutionEnvironment.java:514) 
(org.apache.flink.formats.avro.AvroInputFormat))(4/4) switched to FAILED
java.lang.RuntimeException: java.lang.NoSuchMethodException: 
org.apache.avro.generic.GenericRecord.<init>()
at 
org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:353)
at 
org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:369)
at org.apache.avro.reflect.ReflectData.newRecord(ReflectData.java:901)
at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:212)
at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
at 
org.apache.flink.formats.avro.AvroInputFormat.nextRecord(AvroInputFormat.java:165)
at 
org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:167)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodException: 
org.apache.avro.generic.GenericRecord.<init>()
at java.lang.Class.getConstructor0(Class.java:3082)
at java.lang.Class.getDeclaredConstructor(Class.java:2178)
at 
org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:347)
... 11 more
{code}
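The root cause pattern can be reproduced with plain reflection: an interface such as {{GenericRecord}} has no constructor, so a reflective instantiation path fails with exactly the {{NoSuchMethodException}} seen above (illustrative sketch with stand-in types, not Avro code):

```java
public class ReflectiveNewInstance {
    interface Record {}                           // stands in for GenericRecord (an interface)
    static class RecordImpl implements Record {}  // stands in for a concrete record class

    public static void main(String[] args) throws Exception {
        // Works: a concrete class has a declared no-arg constructor.
        Record ok = RecordImpl.class.getDeclaredConstructor().newInstance();
        System.out.println(ok != null); // prints true

        // Fails: asking an interface for a constructor throws NoSuchMethodException.
        try {
            Record.class.getDeclaredConstructor();
        } catch (NoSuchMethodException e) {
            System.out.println("NoSuchMethodException"); // prints NoSuchMethodException
        }
    }
}
```

This suggests the 1.4.0 read path resolves {{GenericRecord.class}} through a reflective reader that expects an instantiable class, whereas 1.3.2 used a purely generic reader; that interpretation is an assumption, not something stated in the ticket.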



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-8187) Web client does not print errors

2017-12-01 Thread Timo Walther (JIRA)
Timo Walther created FLINK-8187:
---

 Summary: Web client does not print errors
 Key: FLINK-8187
 URL: https://issues.apache.org/jira/browse/FLINK-8187
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Reporter: Timo Walther


When submitting a jar without a defined main class, the web client does not 
respond anymore instead of printing the REST error:

{code}
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: 
Could not run the jar.
at 
org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleJsonRequest$0(JarRunHandler.java:90)
at 
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.util.FlinkException: Could not run the jar.
... 9 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: Neither 
a 'Main-Class', nor a 'program-class' entry was found in the jar file.
at 
org.apache.flink.client.program.PackagedProgram.getEntryPointClassNameFromJar(PackagedProgram.java:592)
at 
org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:188)
at 
org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:147)
at 
org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:72)
at 
org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleJsonRequest$0(JarRunHandler.java:69)
... 8 more
{code} 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] Release 1.4.0, release candidate #2

2017-12-01 Thread Eron Wright
Update on reported Mesos issue (FLINK-8174):

TLDR; a PR will be ready within 24 hours that will undo reservation support.

A couple of months ago, a fix (FLINK-7294) was merged related to how Flink
accepts Mesos resource offers.  The intention was to allow Flink to make
use of so-called +reserved+ resources, a Mesos feature which makes it
possible to reserve hosts for use by a specific framework/role.  The fix
inadvertently regressed the ability to use +unreserved+ resources.  This is
a serious regression because unreserved resources are the common case.

The simple solution is to revert the earlier fix, deferring support for
reservations to another release.  We are spending some time to find a fix
that works for all scenarios, but that seems unlikely at this time.  I am
reaching out to the original contributor to get their feedback.

In the course of the investigation, a related flaw was discovered in Fenzo
that causes Flink to misinterpret offers that contain a mix of reserved and
unreserved resources.   I believe that a small fix is possible purely
within Flink; an update to Fenzo does not appear necessary.

Going forward, we will contribute an improved integration test suite with
which to test Flink under diverse Mesos conditions (e.g. reservations).

Thanks,
Eron

On Thu, Nov 30, 2017 at 9:47 PM, Tzu-Li (Gordon) Tai 
wrote:

> Hi,
>
> I’ve noticed a behavioral regression in the Kafka producer, that should
> also be considered a blocker: https://issues.apache.org/
> jira/browse/FLINK-8181
> There’s already a PR for the issue here: https://github.com/
> apache/flink/pull/5108
>
> Best,
> Gordon
>
> On 30 November 2017 at 5:27:22 PM, Fabian Hueske (fhue...@gmail.com)
> wrote:
>
> I've created a JIRA issue for the Hadoop 2.9.0 build problem [1].
>
> Best, Fabian
>
> [1] https://issues.apache.org/jira/browse/FLINK-8177
>
> 2017-11-30 4:35 GMT+01:00 Eron Wright :
>
> > Unfortunately we've identified a blocker bug for Flink on Mesos -
> > FLINK-8174. We'll have a patch ready on Thursday.
> >
> > Thanks,
> > Eron
> >
> > On Wed, Nov 29, 2017 at 3:40 PM, Eron Wright 
> wrote:
> >
> > > On Dell EMC side, we're testing the RC2 on DCOS 1.10.0. Seeing a
> > > potential issue with offer acceptance and we'll update the thread with
> a
> > +1
> > > or with a more concrete issue within 24 hours.
> > >
> > > Thanks,
> > > Eron
> > >
> > > On Wed, Nov 29, 2017 at 6:54 AM, Chesnay Schepler 
> > > wrote:
> > >
> > >> I don't think anyone has taken a look yet, nor was there a discussion
> as
> > >> to postponing it.
> > >>
> > >> It just slipped through the cracks i guess...
> > >>
> > >>
> > >> On 29.11.2017 15:47, Gyula Fóra wrote:
> > >>
> > >>> Hi guys,
> > >>> I ran into this again while playing with savepoint/restore
> parallelism:
> > >>>
> > >>> https://issues.apache.org/jira/browse/FLINK-7595
> > >>> https://github.com/apache/flink/pull/4651
> > >>>
> > >>> Anyone has some idea about the status of this PR or were we planning
> to
> > >>> postpone this to 1.5?
> > >>>
> > >>> Thanks,
> > >>> Gyula
> > >>>
> > >>>
> > >>> Fabian Hueske  ezt írta (időpont: 2017. nov. 29.,
> > >>> Sze,
> > >>> 13:10):
> > >>>
> > >>> OK, the situation is the following:
> > 
> >  The test class (org.apache.flink.yarn.UtilsTest) implements a
> Hadoop
> >  interface (Container) that was extended in Hadoop 2.9.0 by a getter
> > and
> >  setter.
> >  By adding the methods, we can compile Flink for Hadoop 2.9.0.
> However,
> >  the
> >  getter/setter add a dependency on a class that was also added in
> > Hadoop
> >  2.9.0.
> >  Therefore, the implementation is not backwards compatible with
> Hadoop
> >  versions < 2.9.0.
> > 
> >  Not sure how we can fix the problem. We would need two version of
> the
> >  class
> >  that are chosen based on the Hadoop version. Do we have something
> like
> >  that
> >  somewhere else?
> > 
> >  Since this is only a problem in a test class, Flink 1.4.0 might
> still
> >  work
> >  very well with Hadoop 2.9.0.
> >  However, this has not been tested AFAIK.
> > 
> >  Cheers, Fabian
> > 
> >  2017-11-29 12:47 GMT+01:00 Fabian Hueske :
> > 
> >  I just tried to build the release-1.4 branch for Hadoop 2.9.0
> > (released
> > > a
> > > few days ago) and got a compilation failure in a test class.
> > >
> > > Right now, I'm assessing how much we need to fix to support Hadoop
> > > 2.9.0.
> > > I'll report later.
> > >
> > > Best, Fabian
> > >
> > > 2017-11-29 11:16 GMT+01:00 Aljoscha Krettek :
> > >
> > > Agreed, this is a regression compared to the previous
> functionality.
> > I
> > >> updated the issue to "Blocker".
> > >>
> > >> On 29. Nov 2017, at 10:01, Gyula Fóra 
> wrote:
> > >>>
> > >>> Hi all,
> > >>>
> > >>> I have found the following issue:
> > >>> https://issues.apache.org/jira/browse/FLINK-8165
> > >>>
> > >

[jira] [Created] (FLINK-8188) Clean up flink-contrib

2017-12-01 Thread Bowen Li (JIRA)
Bowen Li created FLINK-8188:
---

 Summary: Clean up flink-contrib
 Key: FLINK-8188
 URL: https://issues.apache.org/jira/browse/FLINK-8188
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.5.0
Reporter: Bowen Li


This is the umbrella ticket for cleaning up flink-contrib. 

We argue that flink-contrib should be removed and all its submodules should be 
migrated to other top-level modules, for the following reasons: 

1) The whole Apache Flink project is itself the result of contributions from many 
developers; there is no reason to highlight some contributions in a dedicated 
module named 'contrib'.
2) flink-contrib is already too crowded and noisy. It contains many submodules 
with different purposes, which confuses developers and users, and the submodules 
lack a proper project hierarchy.
3) Removing it will save us quite some build time.

More details in the discussions at FLINK-8175 and FLINK-8167.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-8189) move flink-statebackend-rocksdb out of flink-contrib

2017-12-01 Thread Bowen Li (JIRA)
Bowen Li created FLINK-8189:
---

 Summary: move flink-statebackend-rocksdb out of flink-contrib
 Key: FLINK-8189
 URL: https://issues.apache.org/jira/browse/FLINK-8189
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Affects Versions: 1.5.0
Reporter: Bowen Li
Assignee: Bowen Li


Move {{flink-statebackend-rocksdb}} out of flink-contrib, probably into its own 
state-backend module or into the {{flink-runtime}} package.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] Release 1.4.0, release candidate #2

2017-12-01 Thread Aljoscha Krettek
Thanks for the update!

Just to be clear, you're proposing going forward with the "simple fix" of 
reverting FLINK-7294?

> On 1. Dec 2017, at 18:39, Eron Wright  wrote:
> 
> Update on reported Mesos issue (FLINK-8174):
> 
> TLDR; a PR will be ready within 24 hours that will undo reservation support.
> 
> A couple of months ago, a fix (FLINK-7294) was merged related to how Flink
> accepts Mesos resource offers.  The intention was to allow Flink to make
> use of so-called +reserved+ resources, a Mesos feature which makes it
> possible to reserve hosts for use by a specific framework/role.  The fix
> inadvertently regressed the ability to use +unreserved+ resources.  This is
> a serious regression because unreserved resources are the common case.
> 
> The simple solution is to revert the earlier fix, deferring support for
> reservations to another release.   We are spending some time to find a fix
> that works for all scenarios, but seems unlikely at this time.   I am
> reaching out to the original contributor to get their feedback.
> 
> In the course of the investigation, a related flaw was discovered in Fenzo
> that causes Flink to misinterpret offers that contain a mix of reserved and
> unreserved resources.   I believe that a small fix is possible purely
> within Flink; an update to Fenzo does not appear necessary.
> 
> Going forward, we will contribute an improved integration test suite with
> which to test Flink under diverse Mesos conditions (e.g. reservations).
> 
> Thanks,
> Eron
> 
> On Thu, Nov 30, 2017 at 9:47 PM, Tzu-Li (Gordon) Tai 
> wrote:
> 
>> Hi,
>> 
>> I’ve noticed a behavioral regression in the Kafka producer, that should
>> also be considered a blocker: https://issues.apache.org/
>> jira/browse/FLINK-8181
>> There’s already a PR for the issue here: https://github.com/
>> apache/flink/pull/5108
>> 
>> Best,
>> Gordon
>> 
>> On 30 November 2017 at 5:27:22 PM, Fabian Hueske (fhue...@gmail.com)
>> wrote:
>> 
>> I've created a JIRA issue for the Hadoop 2.9.0 build problem [1].
>> 
>> Best, Fabian
>> 
>> [1] https://issues.apache.org/jira/browse/FLINK-8177
>> 
>> 2017-11-30 4:35 GMT+01:00 Eron Wright :
>> 
>>> Unfortunately we've identified a blocker bug for Flink on Mesos -
>>> FLINK-8174. We'll have a patch ready on Thursday.
>>> 
>>> Thanks,
>>> Eron
>>> 
>>> On Wed, Nov 29, 2017 at 3:40 PM, Eron Wright 
>> wrote:
>>> 
 On Dell EMC side, we're testing the RC2 on DCOS 1.10.0. Seeing a
 potential issue with offer acceptance and we'll update the thread with
>> a
>>> +1
 or with a more concrete issue within 24 hours.
 
 Thanks,
 Eron
 
 On Wed, Nov 29, 2017 at 6:54 AM, Chesnay Schepler 
 wrote:
 
> I don't think anyone has taken a look yet, nor was there a discussion
>> as
> to postponing it.
> 
> It just slipped through the cracks i guess...
> 
> 
> On 29.11.2017 15:47, Gyula Fóra wrote:
> 
>> Hi guys,
>> I ran into this again while playing with savepoint/restore
>> parallelism:
>> 
>> https://issues.apache.org/jira/browse/FLINK-7595
>> https://github.com/apache/flink/pull/4651
>> 
>> Anyone has some idea about the status of this PR or were we planning
>> to
>> postpone this to 1.5?
>> 
>> Thanks,
>> Gyula
>> 
>> 
>> Fabian Hueske  ezt írta (időpont: 2017. nov. 29.,
>> Sze,
>> 13:10):
>> 
>> OK, the situation is the following:
>>> 
>>> The test class (org.apache.flink.yarn.UtilsTest) implements a
>> Hadoop
>>> interface (Container) that was extended in Hadoop 2.9.0 by a getter
>>> and
>>> setter.
>>> By adding the methods, we can compile Flink for Hadoop 2.9.0.
>> However,
>>> the
>>> getter/setter add a dependency on a class that was also added in
>>> Hadoop
>>> 2.9.0.
>>> Therefore, the implementation is not backwards compatible with
>> Hadoop
>>> versions < 2.9.0.
>>> 
>>> Not sure how we can fix the problem. We would need two version of
>> the
>>> class
>>> that are chosen based on the Hadoop version. Do we have something
>> like
>>> that
>>> somewhere else?
>>> 
>>> Since this is only a problem in a test class, Flink 1.4.0 might
>> still
>>> work
>>> very well with Hadoop 2.9.0.
>>> However, this has not been tested AFAIK.
>>> 
>>> Cheers, Fabian
>>> 
>>> 2017-11-29 12:47 GMT+01:00 Fabian Hueske :
>>> 
>>> I just tried to build the release-1.4 branch for Hadoop 2.9.0
>>> (released
 a
 few days ago) and got a compilation failure in a test class.
 
 Right now, I'm assessing how much we need to fix to support Hadoop
 2.9.0.
 I'll report later.
 
 Best, Fabian
 
 2017-11-29 11:16 GMT+01:00 Aljoscha Krettek :
 
 Agreed, this is a regression compared to the previous
>> functionality.
>>> I
> updated the issue to "Blocker".
> 
>>

Re: [VOTE] Release 1.4.0, release candidate #2

2017-12-01 Thread Eron Wright
There are three levels of support we could land on.
1. Flink works with unreserved resources (revert FLINK-7294).
2. Flink works with unreserved resources, and correctly ignores reserved
resources (revert FLINK-7294 and mitigate Fenzo bug).
3. Flink works with unreserved resources and reserved resources.

3 is a moon shot.  Striving for 2.  Fallback on 1.



On Fri, Dec 1, 2017 at 2:10 PM, Aljoscha Krettek 
wrote:

> Thanks for the update!
>
> Just to be clear, you're proposing going forward with the "simple fix" of
> reverting FLINK-7294?
>
> > On 1. Dec 2017, at 18:39, Eron Wright  wrote:
> >
> > Update on reported Mesos issue (FLINK-8174):
> >
> > TLDR; a PR will be ready within 24 hours that will undo reservation
> support.
> >
> > A couple of months ago, a fix (FLINK-7294) was merged related to how
> Flink
> > accepts Mesos resource offers.  The intention was to allow Flink to make
> > use of so-called +reserved+ resources, a Mesos feature which makes it
> > possible to reserve hosts for use by a specific framework/role.  The fix
> > inadvertently regressed the ability to use +unreserved+ resources.  This
> is
> > a serious regression because unreserved resources are the common case.
> >
> > The simple solution is to revert the earlier fix, deferring support for
> > reservations to another release.   We are spending some time to find a
> fix
> > that works for all scenarios, but seems unlikely at this time.   I am
> > reaching out to the original contributor to get their feedback.
> >
> > In the course of the investigation, a related flaw was discovered in
> Fenzo
> > that causes Flink to misinterpret offers that contain a mix of reserved
> and
> > unreserved resources.   I believe that a small fix is possible purely
> > within Flink; an update to Fenzo does not appear necessary.
> >
> > Going forward, we will contribute an improved integration test suite with
> > which to test Flink under diverse Mesos conditions (e.g. reservations).
> >
> > Thanks,
> > Eron
> >
> > On Thu, Nov 30, 2017 at 9:47 PM, Tzu-Li (Gordon) Tai <
> tzuli...@apache.org>
> > wrote:
> >
> >> Hi,
> >>
> >> I’ve noticed a behavioral regression in the Kafka producer, that should
> >> also be considered a blocker: https://issues.apache.org/
> >> jira/browse/FLINK-8181
> >> There’s already a PR for the issue here: https://github.com/
> >> apache/flink/pull/5108
> >>
> >> Best,
> >> Gordon
> >>
> >> On 30 November 2017 at 5:27:22 PM, Fabian Hueske (fhue...@gmail.com)
> >> wrote:
> >>
> >> I've created a JIRA issue for the Hadoop 2.9.0 build problem [1].
> >>
> >> Best, Fabian
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-8177
> >>
> >> 2017-11-30 4:35 GMT+01:00 Eron Wright :
> >>
> >>> Unfortunately we've identified a blocker bug for Flink on Mesos -
> >>> FLINK-8174. We'll have a patch ready on Thursday.
> >>>
> >>> Thanks,
> >>> Eron
> >>>
> >>> On Wed, Nov 29, 2017 at 3:40 PM, Eron Wright 
> >> wrote:
> >>>
>  On Dell EMC side, we're testing the RC2 on DCOS 1.10.0. Seeing a
>  potential issue with offer acceptance and we'll update the thread with
> >> a
> >>> +1
>  or with a more concrete issue within 24 hours.
> 
>  Thanks,
>  Eron
> 
>  On Wed, Nov 29, 2017 at 6:54 AM, Chesnay Schepler  >
>  wrote:
> 
> > I don't think anyone has taken a look yet, nor was there a discussion
> >> as
> > to postponing it.
> >
> > It just slipped through the cracks i guess...
> >
> >
> > On 29.11.2017 15:47, Gyula Fóra wrote:
> >
> >> Hi guys,
> >> I ran into this again while playing with savepoint/restore
> >> parallelism:
> >>
> >> https://issues.apache.org/jira/browse/FLINK-7595
> >> https://github.com/apache/flink/pull/4651
> >>
> >> Anyone has some idea about the status of this PR or were we planning
> >> to
> >> postpone this to 1.5?
> >>
> >> Thanks,
> >> Gyula
> >>
> >>
> >> Fabian Hueske  ezt írta (időpont: 2017. nov.
> 29.,
> >> Sze,
> >> 13:10):
> >>
> >> OK, the situation is the following:
> >>>
> >>> The test class (org.apache.flink.yarn.UtilsTest) implements a
> >> Hadoop
> >>> interface (Container) that was extended in Hadoop 2.9.0 by a getter
> >>> and
> >>> setter.
> >>> By adding the methods, we can compile Flink for Hadoop 2.9.0.
> >> However,
> >>> the
> >>> getter/setter add a dependency on a class that was also added in
> >>> Hadoop
> >>> 2.9.0.
> >>> Therefore, the implementation is not backwards compatible with
> >> Hadoop
> >>> versions < 2.9.0.
> >>>
> >>> Not sure how we can fix the problem. We would need two version of
> >> the
> >>> class
> >>> that are chosen based on the Hadoop version. Do we have something
> >> like
> >>> that
> >>> somewhere else?
> >>>
> >>> Since this is only a problem in a test class, Flink 1.4.0 might
> >> still
> >>> work
> >>> very well with Hadoop 2.9.0.
> >>