[jira] [Created] (FLINK-10684) Improve the CSV reading process

2018-10-25 Thread Xingcan Cui (JIRA)
Xingcan Cui created FLINK-10684:
---

 Summary: Improve the CSV reading process
 Key: FLINK-10684
 URL: https://issues.apache.org/jira/browse/FLINK-10684
 Project: Flink
  Issue Type: Improvement
  Components: Core
Reporter: Xingcan Cui


CSV is one of the most commonly used file formats in data wrangling. To load 
records from CSV files, Flink provides the basic {{CsvInputFormat}}, as 
well as some variants (e.g., {{RowCsvInputFormat}} and {{PojoCsvInputFormat}}). 
However, the reading process can still be improved. For example, we could add 
a built-in utility that automatically infers the schema from the CSV header and 
a sample of the data. Also, the current bad-record handling could be improved 
by keeping the invalid lines themselves (and ideally the reasons why parsing 
failed), instead of only logging their total number.
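
As a rough illustration of the bad-record idea (this is not an existing Flink API, 
just a minimal standalone sketch), the reader could collect each rejected line 
together with the reason it failed to parse:
{code:java}
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: keeps every invalid CSV line together with the parse error. */
public class CsvBadRecordCollector {

    public static final class BadRecord {
        public final String line;
        public final String reason;

        public BadRecord(String line, String reason) {
            this.line = line;
            this.reason = reason;
        }
    }

    private final List<BadRecord> badRecords = new ArrayList<>();

    /** Returns the parsed fields, or null after recording the line and the failure reason. */
    public String[] parse(String line, int expectedFields) {
        String[] fields = line.split(",", -1);
        if (fields.length != expectedFields) {
            badRecords.add(new BadRecord(
                    line, "expected " + expectedFields + " fields but found " + fields.length));
            return null;
        }
        return fields;
    }

    public List<BadRecord> getBadRecords() {
        return badRecords;
    }
}
{code}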

This is an umbrella issue for all the improvements and bug fixes for the CSV 
reading process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10683) Error while executing BLOB connection. java.io.IOException: Unknown operation

2018-10-25 Thread Yee (JIRA)
Yee created FLINK-10683:
---

 Summary: Error while executing BLOB connection. 
java.io.IOException: Unknown operation
 Key: FLINK-10683
 URL: https://issues.apache.org/jira/browse/FLINK-10683
 Project: Flink
  Issue Type: Bug
Reporter: Yee


{code}
ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
java.io.IOException: Unknown operation 5
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
2018-10-26 01:49:35,247 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
java.io.IOException: Unknown operation 18
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
2018-10-26 01:49:35,550 ERROR org.apache.flink.runtime.blob.BlobServerConnection - PUT operation failed
java.io.IOException: Unknown type of BLOB addressing.
at org.apache.flink.runtime.blob.BlobServerConnection.put(BlobServerConnection.java:347)
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:127)
2018-10-26 01:49:35,854 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
java.io.IOException: Unknown operation 3
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
2018-10-26 01:49:36,159 ERROR org.apache.flink.runtime.blob.BlobServerConnection - PUT operation failed
java.io.IOException: Unexpected number of incoming bytes: 50353152
at org.apache.flink.runtime.blob.BlobServerConnection.put(BlobServerConnection.java:368)
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:127)
2018-10-26 01:49:36,463 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
java.io.IOException: Unknown operation 105
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
2018-10-26 01:49:36,765 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
java.io.IOException: Unknown operation 71
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
2018-10-26 01:49:37,069 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
java.io.IOException: Unknown operation 128
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
2018-10-26 01:49:37,373 ERROR org.apache.flink.runtime.blob.BlobServerConnection - PUT operation failed
java.io.IOException: Unexpected number of incoming bytes: 4302592
at org.apache.flink.runtime.blob.BlobServerConnection.put(BlobServerConnection.java:368)
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:127)
2018-10-26 01:49:37,676 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
java.io.IOException: Unknown operation 115
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
2018-10-26 01:49:37,980 ERROR org.apache.flink.runtime.blob.BlobServerConnection - Error while executing BLOB connection.
java.io.IOException: Unknown operation 71
at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:136)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10682) EOFException occurs during deserialization of Avro class

2018-10-25 Thread Ben La Monica (JIRA)
Ben La Monica created FLINK-10682:
-

 Summary: EOFException occurs during deserialization of Avro class
 Key: FLINK-10682
 URL: https://issues.apache.org/jira/browse/FLINK-10682
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Affects Versions: 1.5.4
 Environment: AWS EMR 5.17 (upgraded to Flink 1.5.4)
3 task managers, 1 job manager running in YARN in Hadoop
Running on Amazon Linux with OpenJDK 1.8
Reporter: Ben La Monica


I'm running into a problem that usually occurs after about an hour of processing in a 
StreamExecutionEnvironment, at which point I get the failure message below. I'm at a loss 
for what is causing it. I'm running this in AWS on EMR 5.17 with 3 task managers and a job 
manager in a YARN cluster, and I've upgraded my Flink libraries to 1.5.4 to bypass another 
serialization issue and the Kerberos auth issues.

The Avro classes that are being deserialized were generated with Avro 1.8.2.
{code:java}
2018-10-22 16:12:10,680 [INFO ] class=o.a.flink.runtime.taskmanager.Task thread="Calculate Estimated NAV -> Split into single messages (3/10)" Calculate Estimated NAV -> Split into single messages (3/10) (de7d8fa7784903a475391d0168d56f2e) switched from RUNNING to FAILED.
java.io.EOFException: null
at 
org.apache.flink.core.memory.DataInputDeserializer.readLong(DataInputDeserializer.java:219)
at 
org.apache.flink.core.memory.DataInputDeserializer.readDouble(DataInputDeserializer.java:138)
at 
org.apache.flink.formats.avro.utils.DataInputDecoder.readDouble(DataInputDecoder.java:70)
at org.apache.avro.io.ResolvingDecoder.readDouble(ResolvingDecoder.java:190)
at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:186)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
at 
org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:116)
at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
at 
org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:266)
at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:177)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
at 
org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:116)
at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
at 
org.apache.flink.formats.avro.typeutils.AvroSerializer.deserialize(AvroSerializer.java:172)
at 
org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:208)
at 
org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:49)
at 
org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
at 
org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:140)
at 
org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:208)
at 
org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:116)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
at java.lang.Thread.run(Thread.java:748){code}
Do you have any ideas on how I could further troubleshoot this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10681) elasticsearch6.ElasticsearchSinkITCase fails if wrong JNA installed

2018-10-25 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-10681:
-

 Summary: elasticsearch6.ElasticsearchSinkITCase fails if wrong JNA 
installed
 Key: FLINK-10681
 URL: https://issues.apache.org/jira/browse/FLINK-10681
 Project: Flink
  Issue Type: Bug
  Components: ElasticSearch Connector, Tests
Affects Versions: 1.6.1, 1.7.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.6.2, 1.7.0


The {{elasticsearch6.ElasticsearchSinkITCase}} fails on systems where a wrong 
JNA library is installed.

{code}
There is an incompatible JNA native library installed on this system
Expected: 5.2.0
Found: 4.0.0
/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib.
To resolve this issue you may do one of the following:
 - remove or uninstall the offending library
 - set the system property jna.nosys=true
 - set jna.boot.library.path to include the path to the version of the
   jnidispatch library included with the JNA jar file you are using

at com.sun.jna.Native.<clinit>(Native.java:199)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.elasticsearch.bootstrap.Natives.<clinit>(Natives.java:45)
at org.elasticsearch.bootstrap.BootstrapInfo.isMemoryLocked(BootstrapInfo.java:50)
at org.elasticsearch.monitor.process.ProcessProbe.processInfo(ProcessProbe.java:130)
at org.elasticsearch.monitor.process.ProcessService.<init>(ProcessService.java:44)
at org.elasticsearch.monitor.MonitorService.<init>(MonitorService.java:48)
at org.elasticsearch.node.Node.<init>(Node.java:363)
at 
org.apache.flink.streaming.connectors.elasticsearch.EmbeddedElasticsearchNodeEnvironmentImpl$PluginNode.<init>(EmbeddedElasticsearchNodeEnvironmentImpl.java:85)
at 
org.apache.flink.streaming.connectors.elasticsearch.EmbeddedElasticsearchNodeEnvironmentImpl.start(EmbeddedElasticsearchNodeEnvironmentImpl.java:53)
at 
org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkTestBase.prepare(ElasticsearchSinkTestBase.java:73)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at 
org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at 
org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:113)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
{code}

I propose to solve the problem by setting the system property 
{{jna.nosys=true}} to prefer the bundled JNA library.
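
As a rough sketch of that workaround (where exactly the property is set in the test setup 
is an assumption here), the property just has to be set before any JNA class is loaded:
{code:java}
public class JnaSystemPropertyWorkaround {

    static {
        // Prefer the jnidispatch library bundled with the JNA jar over the
        // (possibly incompatible) one installed on the system.
        System.setProperty("jna.nosys", "true");
    }

    public static void main(String[] args) {
        System.out.println("jna.nosys = " + System.getProperty("jna.nosys"));
    }
}
{code}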



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10680) Unable to set negative offsets for TumblingEventTimeWindow

2018-10-25 Thread Paul Lin (JIRA)
Paul Lin created FLINK-10680:


 Summary: Unable to set negative offsets for TumblingEventTimeWindow
 Key: FLINK-10680
 URL: https://issues.apache.org/jira/browse/FLINK-10680
 Project: Flink
  Issue Type: Bug
  Components: DataStream API
Affects Versions: 1.6.1
Reporter: Paul Lin


The following code, which is given in the documentation, throws an IllegalArgumentException: 
"TumblingEventTimeWindows parameters must satisfy 0 <= offset < size".

> TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8));

By design, the offset should be allowed to be negative to accommodate different time zones.
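
A minimal sketch that reproduces the reported behaviour on 1.6.1 (the job name and dummy 
input are placeholders):
{code:java}
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class NegativeOffsetExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Tuple2<String, Long>> input = env.fromElements(Tuple2.of("a", 1L));

        // The documented way to align daily windows with a UTC+8 time zone; on 1.6.1 this
        // throws "TumblingEventTimeWindows parameters must satisfy 0 <= offset < size".
        input.keyBy(0)
             .window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8)))
             .sum(1);

        env.execute("negative-offset-example");
    }
}
{code}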



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Question over Incremental Snapshot vs Full Snapshot in rocksDb state backend

2018-10-25 Thread Andrey Zagrebin
Hi Chandan,

> 1. Why did we took 2 different approaches using different RocksDB apis ?
> We could have used Checkpoint api of RocksDB for fullSnapshot as well .

The reason here is partially historical. The full snapshot in the RocksDB backend was 
implemented before the incremental snapshot (and the rescaling support for it), but after 
the heap backend. The full snapshot in RocksDB uses an approach close to the heap backend's 
because the Flink community plans to support a unified format for savepoints. A unified 
format would make it possible to switch backends and still restore from a savepoint. The 
formats still differ due to backend specifics that optimise snapshotting and restore, but 
it is technically possible to unify them in the future.

> 2. Is there any specific reason to use Snapshot API of rocksDB  over 
> Checkpoint api of RocksDB for fullSnapshot?

I think the Checkpoint API produces a separate list of SST files so that they can be copied 
to HDFS in the case of an incremental snapshot.

A full snapshot does not need that file list; it just needs an iterator over the snapshotted 
(frozen) data. Internally, RocksDB simply hard-links the already existing immutable SST 
files and iterates over their data for the Snapshot API.
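
For anyone curious, here is a minimal standalone sketch (plain RocksDB Java API, not Flink 
code; the paths are placeholders) contrasting the two APIs described above:
{code:java}
import org.rocksdb.Checkpoint;
import org.rocksdb.Options;
import org.rocksdb.ReadOptions;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksIterator;
import org.rocksdb.Snapshot;

public class SnapshotVsCheckpoint {

    public static void main(String[] args) throws Exception {
        RocksDB.loadLibrary();
        try (Options options = new Options().setCreateIfMissing(true);
             RocksDB db = RocksDB.open(options, "/tmp/rocksdb-demo")) {

            // Snapshot API: a frozen, consistent view that can be iterated entry by entry,
            // which is roughly what the full-snapshot path needs.
            Snapshot snapshot = db.getSnapshot();
            try (ReadOptions readOptions = new ReadOptions().setSnapshot(snapshot);
                 RocksIterator it = db.newIterator(readOptions)) {
                for (it.seekToFirst(); it.isValid(); it.next()) {
                    // serialize it.key() / it.value() to the snapshot stream here
                }
            } finally {
                db.releaseSnapshot(snapshot);
            }

            // Checkpoint API: materializes the current state as hard-linked SST files in a
            // local directory, whose file list can then be uploaded incrementally.
            try (Checkpoint checkpoint = Checkpoint.create(db)) {
                checkpoint.createCheckpoint("/tmp/rocksdb-demo-checkpoint");
            }
        }
    }
}
{code}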

Best,
Andrey


> On 24 Oct 2018, at 18:40, chandan prakash  wrote:
> 
> Thanks Tzu-Li for redirecting.
> Would also like to be corrected if my any inference from the code is 
> incorrect or incomplete.
> I am sure it will help to clear doubts of more developers like me  :)
> Thanks in advance.
> 
> Regards,
> Chandan
> 
> 
> On Wed, Oct 24, 2018 at 9:19 PM Tzu-Li (Gordon) Tai  > wrote:
> Hi,
> 
> I’m forwarding this question to Stefan (cc’ed).
> He would most likely be able to answer your question, as he has done 
> substantial work in the RocksDB state backends.
> 
> Cheers,
> Gordon
> 
> 
> On 24 October 2018 at 8:47:24 PM, chandan prakash (chandanbaran...@gmail.com 
> ) wrote:
> 
>> Hi,
>> I am new to Flink.
>> Was looking into the code to understand how Flink does FullSnapshot and 
>> Incremental Snapshot using RocksDB
>> 
>> What I understood:
>> 1. For full snapshot, we call RocksDb snapshot api which basically an 
>> iterator handle to the entries in RocksDB instance. We iterate over every 
>> entry one by one and serialize that to some distributed file system. 
>> Similarly in restore for fullSnapshot, we read the file to get every entry 
>> and apply that to the rocksDb instance one by one to fully construct the db 
>> instance.
>> 
>> 2. On the other hand in for Incremental Snapshot, we rely on RocksDB 
>> Checkpoint api to copy the sst files to HDFS/S3 incrementally.
>> Similarly on restore, we copy the sst files to local directory and 
>> instantiate rocksDB instance with the path of the directory.
>> 
>> My Question is:
>> 1. Why did we took 2 different approaches using different RocksDB apis ?
>> We could have used Checkpoint api of RocksDB for fullSnapshot as well .
>> 2. Is there any specific reason to use Snapshot API of rocksDB  over 
>> Checkpoint api of RocksDB for fullSnapshot?
>> 
>> I am sure, I am missing some important point, really curious to know that.
>> Any explanation will be really great. Thanks in advance.
>> 
>> 
>> Regards,
>> Chandan
>> 
>> 
>> 
>> 
>> 
>> --
>> Chandan Prakash
>> 
> 
> 
> -- 
> Chandan Prakash
> 



[jira] [Created] (FLINK-10679) Let TypeSerializerSchemaCompatibility.resolveSchemaCompatibility() be the entry point for compatibility checks in framework code

2018-10-25 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-10679:
---

 Summary: Let 
TypeSerializerSchemaCompatibility.resolveSchemaCompatibility() be the entry 
point for compatibility checks in framework code
 Key: FLINK-10679
 URL: https://issues.apache.org/jira/browse/FLINK-10679
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing, Type Serialization System
Reporter: Tzu-Li (Gordon) Tai
Assignee: Stephan Ewen


Currently, the state backend framework code is still exposed to the now deprecated 
{{CompatibilityResult}} and related classes.

Instead, all compatibility checks should go through the new 
{{TypeSerializerSchemaCompatibility#resolveSchemaCompatibility}} method, and 
allow framework code to check against the more powerful 
{{TypeSerializerSchemaCompatibility}} for incompatibility / migration 
requirements.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Reverse the order of fields in Flink SQL

2018-10-25 Thread wangsan
Hi Yinhua,

This is actually a bug in the Flink Table API; you can check this issue: 
https://issues.apache.org/jira/browse/FLINK-10290.

I opened a PR for this issue a couple of days ago, but there are still some problems, so 
it is not ready to be merged yet. We have been using it in our internal Flink version, and 
so far it works well. Maybe you can take a look at it.

Best,
wangsan

> On Oct 24, 2018, at 9:31 AM, yinhua.dai  wrote:
> 
> Hi Timo,
> 
> I write simple testing code for the issue, please checkout
> https://gist.github.com/yinhua-dai/143304464270afd19b6a926531f9acb1
> 
> I write a custom table source which just use RowCsvInputformat to create the
> dataset, and use the provided CsvTableSink, and can reproduce the issue.
> 
> 
> 
> --
> Sent from: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/



[jira] [Created] (FLINK-10678) Add a switch to run_test to configure if logs should be checked for errors/exceptions

2018-10-25 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-10678:


 Summary: Add a switch to run_test to configure if logs should be 
checked for errors/exceptions
 Key: FLINK-10678
 URL: https://issues.apache.org/jira/browse/FLINK-10678
 Project: Flink
  Issue Type: Bug
  Components: E2E Tests
Affects Versions: 1.7.0
Reporter: Dawid Wysakowicz
 Fix For: 1.7.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10677) Setup hadoop-free E2E test cron job

2018-10-25 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-10677:


 Summary: Setup hadoop-free E2E test cron job
 Key: FLINK-10677
 URL: https://issues.apache.org/jira/browse/FLINK-10677
 Project: Flink
  Issue Type: Improvement
  Components: E2E Tests, Tests, Travis
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Release 1.6.2, release candidate #1

2018-10-25 Thread Chesnay Schepler

+1

- ran a local cluster
- submitted multiple jobs through WebUI
- verified availability of metrics
- verified that latency marks are still enabled by default but 
configurable via configuration


On 18.10.2018 13:18, Chesnay Schepler wrote:

Hi everyone,
Please review and vote on the release candidate #1 for the version 
1.6.2, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases 
to be deployed to dist.apache.org [2], which are signed with the key 
with fingerprint 11D464BA [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.6.2-rc1" [5],
* website pull request listing the new release and adding announcement 
blog post [6].


The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.


Thanks,
Release Manager

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344110

[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.6.2/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] 
https://repository.apache.org/content/repositories/orgapacheflink-1186
[5] 
https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=refs/tags/release-1.6.2-rc1

[6] https://github.com/apache/flink-web/pull/128







Re: [VOTE] Release 1.5.5, release candidate #1

2018-10-25 Thread Chesnay Schepler

+1

- ran a local cluster
- submitted multiple jobs through WebUI
- verified availability of metrics
- verified that latency marks are still enabled by default but 
configurable via configuration


On 18.10.2018 13:17, Chesnay Schepler wrote:

Hi everyone,
Please review and vote on the release candidate #1 for the version 
1.5.5, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases 
to be deployed to dist.apache.org [2], which are signed with the key 
with fingerprint 11D464BA [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.5.5-rc1" [5],
* website pull request listing the new release and adding announcement 
blog post [6].


The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.


Thanks,
Release Manager

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344112

[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.5.5/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] 
https://repository.apache.org/content/repositories/orgapacheflink-1185
[5] 
https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=refs/tags/release-1.5.5-rc1

[6] https://github.com/apache/flink-web/pull/127






Re: [VOTE] Release 1.5.5, release candidate #1

2018-10-25 Thread Dawid Wysakowicz
Thanks Chesnay for the investigation. Your explanations work in my case
as well. Therefore I give +1.

On 19/10/2018 13:19, Chesnay Schepler wrote:
> See the JIRA for details.
>
> On 19.10.2018 13:18, Chesnay Schepler wrote:
>> This is likely not a new issue and existed since FLINK-9234 was
>> merged. (1.4.3/1.5.0)
>>
>> On 19.10.2018 11:16, Chesnay Schepler wrote:
>>> For the test failure: https://issues.apache.org/jira/browse/FLINK-10609
>>>
>>> On 19.10.2018 11:00, Dawid Wysakowicz wrote:
 Hi,

 -1

 I've checked:
 + compiling sources works
 + verified signatures - OK

 - the release notes are missing two issues: FLINK-10259,
 FLINK-10247, and are including an issue that was actually not
 merged for this branch: FLINK-4052
 - I get the same test failures as Fabian did:

 Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
 MaxPermSize=128m; support was removed in 8.0
 Running org.apache.flink.addons.hbase.HBaseConnectorITCase
 Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
 0.557 sec <<< FAILURE! - in
 org.apache.flink.addons.hbase.HBaseConnectorITCase
 org.apache.flink.addons.hbase.HBaseConnectorITCase  Time elapsed:
 0.556 sec  <<< ERROR!
 java.lang.ClassCastException:
 org.apache.commons.logging.impl.SLF4JLocationAwareLog cannot be
 cast to org.apache.commons.logging.impl.Log4JLogger

 org.apache.flink.addons.hbase.HBaseConnectorITCase  Time elapsed:
 0.557 sec  <<< ERROR!
 java.lang.NullPointerException


 Results :

 Tests in error:
 org.apache.flink.addons.hbase.HBaseConnectorITCase.org.apache.flink.addons.hbase.HBaseConnectorITCase

   Run 1:
 HBaseConnectorITCase>HBaseTestingClusterAutostarter.setUp:152 »
 ClassCast org

   Run 2:
 HBaseConnectorITCase>HBaseTestingClusterAutostarter.tearDown:243 »
 NullPointer


 On 18/10/2018 23:55, Fabian Hueske wrote:
> Hi,
>
> I checked the following things:
>
> * no dependencies added or changed since Flink 1.5.4
> * compiling the source distribution without tests succeeds
> * compiling the source distribution with tests fails (see exception
> appended below). When I restart the compilation, it goes past
> flink-hbase
> but fails later in another module.
>
> The compilation failure might be related to my environment as the
> compilation of the Flink 1.6.2 RC1 also failed due to some issue
> related to
> logging classes.
> We should check if others can compile the sources with tests.
>
> Until we don't know what causes the issue, I vote
>
> -1 to release this RC.
>
> Best, Fabian
>
> 
>
> OpenJDK 64-Bit Server VM warning: ignoring option
> MaxPermSize=128m; support
> was removed in 8.0
> Running org.apache.flink.addons.hbase.HBaseConnectorITCase
> Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
> 0.482 sec
> <<< FAILURE! - in org.apache.flink.addons.hbase.HBaseConnectorITCase
> org.apache.flink.addons.hbase.HBaseConnectorITCase  Time elapsed:
> 0.481
> sec  <<< ERROR!
> java.lang.ClassCastException:
> org.apache.commons.logging.impl.SLF4JLocationAwareLog cannot be
> cast to
> org.apache.commons.logging.impl.Log4JLogger
>
> org.apache.flink.addons.hbase.HBaseConnectorITCase  Time elapsed:
> 0.482
> sec  <<< ERROR!
> java.lang.NullPointerException
>
>
> Results :
>
> Tests in error:
> org.apache.flink.addons.hbase.HBaseConnectorITCase.org.apache.flink.addons.hbase.HBaseConnectorITCase
>
>    Run 1:
> HBaseConnectorITCase>HBaseTestingClusterAutostarter.setUp:152 »
> ClassCast org
>    Run 2:
> HBaseConnectorITCase>HBaseTestingClusterAutostarter.tearDown:243 »
> NullPointer
>
>
>
> Am Do., 18. Okt. 2018 um 13:17 Uhr schrieb Chesnay Schepler <
> ches...@apache.org>:
>
>> Hi everyone,
>> Please review and vote on the release candidate #1 for the version
>> 1.5.5, as follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific
>> comments)
>>
>>
>> The complete staging area is available for your review, which
>> includes:
>> * JIRA release notes [1],
>> * the official Apache source release and binary convenience
>> releases to
>> be deployed to dist.apache.org [2], which are signed with the key
>> with
>> fingerprint 11D464BA [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "release-1.5.5-rc1" [5],
>> * website pull request listing the new release and adding
>> announcement
>> blog post [6].
>>
>> The vote will be open for at least 72 hours. It is adopted by
>> majority
>> approval, with at least 3 PMC 

[jira] [Created] (FLINK-10676) Add 'as' method for OverWindowWithOrderBy in Java API

2018-10-25 Thread sunjincheng (JIRA)
sunjincheng created FLINK-10676:
---

 Summary: Add 'as' method for OverWindowWithOrderBy in Java API
 Key: FLINK-10676
 URL: https://issues.apache.org/jira/browse/FLINK-10676
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Affects Versions: 1.7.0
Reporter: sunjincheng
 Fix For: 1.7.0


The preceding clause of an OVER window is optional in traditional databases; the default is 
UNBOUNDED. So we can add an "as" method to OverWindowWithOrderBy, which makes OVER windows 
easier to write. E.g.:
{code:java}
.window(Over partitionBy 'c orderBy 'proctime preceding UNBOUNDED_ROW as 
'w){code}
Can be simplified as follows:
{code:java}
.window(Over partitionBy 'c orderBy 'proctime as 'w){code}
What do you think?

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10675) Fix dependency issues in sql & table integration

2018-10-25 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-10675:


 Summary: Fix dependency issues in sql & table integration
 Key: FLINK-10675
 URL: https://issues.apache.org/jira/browse/FLINK-10675
 Project: Flink
  Issue Type: Bug
  Components: CEP, SQL Client, Table API & SQL
Affects Versions: 1.7.0
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz
 Fix For: 1.7.0


There are two issues with the dependencies:
* the check for the CEP dependency in {{DataStreamMatchRule}} should use the thread 
context classloader (see the sketch below)
* we should add CEP as a dependency of the SQL Client
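
A rough sketch of what such a classloader-based check could look like (the class name 
probed here is only an illustrative assumption, not necessarily the one 
{{DataStreamMatchRule}} uses):
{code:java}
public class CepDependencyCheck {

    /** Probes the thread context classloader instead of the classloader that loaded this class. */
    public static boolean isCepOnClasspath() {
        try {
            Class.forName(
                    "org.apache.flink.cep.CEP",
                    false, // do not initialize the class
                    Thread.currentThread().getContextClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("flink-cep available: " + isCepOnClasspath());
    }
}
{code}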



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10674) DistinctAccumulator.remove lead to NPE

2018-10-25 Thread ambition (JIRA)
ambition created FLINK-10674:


 Summary: DistinctAccumulator.remove lead to NPE
 Key: FLINK-10674
 URL: https://issues.apache.org/jira/browse/FLINK-10674
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 1.6.1
 Environment: Flink 1.6.0
Reporter: ambition
 Attachments: image-2018-10-25-14-46-03-373.png

Our online Flink job had been running for about a week; the job contains this SQL:
{code:java}
select  `time`,  
lower(trim(os_type)) as os_type, 
count(distinct feed_id) as feed_total_view  
from  my_table 
group by `time`, lower(trim(os_type)){code}
 

Then the following NPE occurred:

 
{code:java}
java.lang.NullPointerException

at scala.Predef$.Long2long(Predef.scala:363)

at 
org.apache.flink.table.functions.aggfunctions.DistinctAccumulator.remove(DistinctAccumulator.scala:109)

at NonWindowedAggregationHelper$894.retract(Unknown Source)

at 
org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.processElement(GroupAggProcessFunction.scala:124)

at 
org.apache.flink.table.runtime.aggregate.GroupAggProcessFunction.processElement(GroupAggProcessFunction.scala:39)

at 
org.apache.flink.streaming.api.operators.LegacyKeyedProcessOperator.processElement(LegacyKeyedProcessOperator.java:88)

at 
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202)

at 
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)

at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)

at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)

at java.lang.Thread.run(Thread.java:745)
{code}
 

 

See the implementation of {{DistinctAccumulator.remove}}:
!image-2018-10-25-14-46-03-373.png!


 

This NPE is presumably caused by {{currentCnt}} being null, so we could handle it simply like this:
{code:java}
def remove(params: Row): Boolean = {
  if (!distinctValueMap.contains(params)) {
    true
  } else {
    val currentCnt = distinctValueMap.get(params)
    // a null count is treated like the last remaining occurrence: drop the entry
    if (currentCnt == null || currentCnt == 1) {
      distinctValueMap.remove(params)
      true
    } else {
      var value = currentCnt - 1L
      if (value < 0) {
        value = 1
      }
      distinctValueMap.put(params, value)
      false
    }
  }
}{code}
 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)