[jira] [Closed] (FLINK-4781) Add User Defined Function(UDF) API for Table API

2016-10-10 Thread Shaoxuan Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaoxuan Wang closed FLINK-4781.

Resolution: Duplicate

> Add User Defined Function(UDF) API for Table API
> 
>
> Key: FLINK-4781
> URL: https://issues.apache.org/jira/browse/FLINK-4781
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: sunjincheng
>






[jira] [Commented] (FLINK-4781) Add User Defined Function(UDF) API for Table API

2016-10-10 Thread sunjincheng (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564217#comment-15564217
 ] 

sunjincheng commented on FLINK-4781:


Thanks for taking the time to check this, Fabian.
When we tried to create the PR yesterday, we noticed "ScalarFunction" and 
"UserDefinedFunction". And yes, it seems that ScalarFunction achieves the 
same goal as our design. In fact, we have implemented three different 
user-defined functions: UDF, UDT(table)F, and UDA(aggregate)F. Let me close 
this ticket and open/work on another ticket for UDTF and UDAF.
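
For orientation, a minimal sketch of what a scalar UDF built on ScalarFunction 
looks like (hedged: the package layout and registration API changed across 
Flink versions; this follows the later org.apache.flink.table.functions 
layout, and the eval method is resolved reflectively):

{code}
import org.apache.flink.table.functions.ScalarFunction

// A scalar function: one value in, one value out; the Table API picks up
// the eval() method via reflection.
class HashCode extends ScalarFunction {
  def eval(s: String): Int = s.hashCode()
}
{code}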

> Add User Defined Function(UDF) API for Table API
> 
>
> Key: FLINK-4781
> URL: https://issues.apache.org/jira/browse/FLINK-4781
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: sunjincheng
>






[jira] [Commented] (FLINK-1091) Allow joins with the solution set using key selectors

2016-10-10 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564179#comment-15564179
 ] 

Neelesh Srinivas Salian commented on FLINK-1091:


Hi, [~vkalavri], checking if this has been resolved or still being worked on?

> Allow joins with the solution set using key selectors
> -
>
> Key: FLINK-1091
> URL: https://issues.apache.org/jira/browse/FLINK-1091
> Project: Flink
>  Issue Type: Sub-task
>  Components: Iterations
>Reporter: Vasia Kalavri
>Assignee: Vasia Kalavri
>Priority: Minor
>  Labels: easyfix, features
>
> Currently, the solution set may only be joined using tuple field 
> positions.
> A possible solution is to provide explicit functions "joinWithSolution" 
> and "coGroupWithSolution" to make sure the keys used are valid. 





[jira] [Commented] (FLINK-959) Automated bare-metal deployment of FLINK on Amazon EC2 and OpenStack instances

2016-10-10 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564177#comment-15564177
 ] 

Neelesh Srinivas Salian commented on FLINK-959:
---

Is this still applicable? [~rmetzger] or [~uce], do you know if this is needed?

> Automated bare-metal deployment of FLINK on Amazon EC2 and OpenStack instances
> --
>
> Key: FLINK-959
> URL: https://issues.apache.org/jira/browse/FLINK-959
> Project: Flink
>  Issue Type: New Feature
>  Components: Build System
>Affects Versions: pre-apache-0.5
>Reporter: Tobias
>Assignee: Tobias
>Priority: Minor
> Fix For: pre-apache-0.5
>
>
> This Python script starts Amazon EC2/OpenStack instances and installs 
> Java and Hadoop, configuring HDFS/YARN via Puppet, in order to run FLINK on 
> top of Hadoop YARN.
> The Java and Hadoop binaries are downloaded by the script and handed over to 
> Puppet for automated provisioning.
> User-data scripts are used to install Puppet (Debian only) on the master and 
> slave instances. Security groups are created and configured accordingly. 
> The master instance then starts a self-configuration process, so that the 
> Puppet modules are set up according to the cluster structure. 
> The master detects whether the Hadoop YARN web interface is accessible and 
> waits for all expected nodes to be up and running. Then a Stratosphere YARN 
> session is started. TaskManager and JobManager memory allocations are set up 
> in the instances.cfg.
> Notes:
> - The configuration reserves 600 MB for the operating system and allocates 
> the rest for the YARN node.
> - The Flink web interface is not accessible because the yarn.web.proxy throws 
> a NullPointerException.
> - Only runs on Debian derivatives because it uses apt-get.
> - Tested with ubuntu-13.08.
> - FLINK is still named Stratosphere.
> Code at: https://github.com/tobwiens/StratopshereBareMetalProvPuppet





[jira] [Commented] (FLINK-1497) No documentation on MiniCluster

2016-10-10 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564170#comment-15564170
 ] 

Neelesh Srinivas Salian commented on FLINK-1497:


Checking if this is still valid, unless fixed elsewhere, [~StephanEwen]

> No documentation on MiniCluster
> ---
>
> Key: FLINK-1497
> URL: https://issues.apache.org/jira/browse/FLINK-1497
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.9
>Reporter: Sergey Dudoladov
>Priority: Trivial
>  Labels: documentation
>
>  It looks like the Flink docs do not show how to run a MiniCluster. 
>  It might be worth documenting this feature and adding relevant scripts to 
> the /bin folder, e.g. start_mini.sh and stop_mini.sh
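
In the meantime, the closest documented route is a local execution 
environment, which spins up an embedded mini cluster inside the current JVM. 
A minimal sketch (Scala):

{code}
import org.apache.flink.api.scala._

object LocalMiniClusterSketch extends App {
  // Starts an embedded (mini) cluster in this JVM when the job executes.
  val env = ExecutionEnvironment.createLocalEnvironment()

  env.fromElements("to be", "or not to be")
    .flatMap(_.split(" "))
    .map((_, 1))
    .groupBy(0)
    .sum(1)
    .print()
}
{code}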





[jira] [Commented] (FLINK-4792) Update documentation - FlinkML/QuickStart Guide

2016-10-10 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564166#comment-15564166
 ] 

Neelesh Srinivas Salian commented on FLINK-4792:


Shall I work on this if no one has begun?

> Update documentation - FlinkML/QuickStart Guide
> ---
>
> Key: FLINK-4792
> URL: https://issues.apache.org/jira/browse/FLINK-4792
> Project: Flink
>  Issue Type: Improvement
>Reporter: Thomas FOURNIER
>Priority: Trivial
>
> Hi,
> I'm going through the first steps of FlinkML/QuickStart guide: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/quickstart.html
> When using env.readCsvFile:
> val env = ExecutionEnvironment.getExecutionEnvironment
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> I encounter the following error:
> Error:(17, 69) could not find implicit value for evidence parameter of type 
> org.apache.flink.api.common.typeinfo.TypeInformation[(String, String, String, 
> String)]
> Error occurred in an application involving default arguments.
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> To solve this issue, you need the following import:
> import org.apache.flink.api.scala._ (instead of import 
> org.apache.flink.api.scala.ExecutionEnvironment as documented).
> I think it would be relevant to update the documentation, even if this point 
> is mentioned in the FAQ.
> Thanks
> Best
> Thomas
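
A self-contained version of the fix described above (Scala; "path/data" is the 
placeholder path from the report):

{code}
// The wildcard import brings the implicit TypeInformation instances for
// tuple types into scope; importing only ExecutionEnvironment leaves the
// evidence parameter unresolved.
import org.apache.flink.api.scala._

object QuickstartCsvSketch extends App {
  val env = ExecutionEnvironment.getExecutionEnvironment
  val survival = env.readCsvFile[(String, String, String, String)]("path/data")
  survival.print()
}
{code}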





[jira] [Commented] (FLINK-4627) Use Flink's PropertiesUtil in Kinesis connector to extract typed values from config properties

2016-10-10 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564165#comment-15564165
 ] 

Neelesh Srinivas Salian commented on FLINK-4627:


Shall I work on this if no one has already begun?

> Use Flink's PropertiesUtil in Kinesis connector to extract typed values from 
> config properties 
> ---
>
> Key: FLINK-4627
> URL: https://issues.apache.org/jira/browse/FLINK-4627
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Trivial
> Fix For: 1.2.0
>
>
> Right now value extraction from config properties in the Kinesis connector is 
> done with the plain methods from {{java.util.Properties}} plus string parsing.
> We can use Flink's utility {{PropertiesUtil}} to do this with fewer and more 
> readable lines of code.
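
A minimal sketch of the difference (Scala; "fetch.size" is a hypothetical key, 
not an actual Kinesis connector setting):

{code}
import java.util.Properties
import org.apache.flink.util.PropertiesUtil

object PropertiesUtilSketch extends App {
  val props = new Properties()
  props.setProperty("fetch.size", "2048")

  // Plain java.util.Properties forces manual string parsing:
  val manual = Integer.parseInt(props.getProperty("fetch.size", "1024"))

  // PropertiesUtil handles parsing and the default in one call:
  val viaUtil = PropertiesUtil.getInt(props, "fetch.size", 1024)

  println(s"$manual == $viaUtil")
}
{code}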





[GitHub] flink issue #2442: [FLINK-4148] incorrect calculation minDist distance in Qu...

2016-10-10 Thread nssalian
Github user nssalian commented on the issue:

https://github.com/apache/flink/pull/2442
  
Seems good to me. @zentol do you have time to add some extra review on 
this? 




[jira] [Commented] (FLINK-4148) incorrect calculation distance in QuadTree

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564163#comment-15564163
 ] 

ASF GitHub Bot commented on FLINK-4148:
---

Github user nssalian commented on the issue:

https://github.com/apache/flink/pull/2442
  
Seems good to me. @zentol do you have time to add some extra review on 
this? 


> incorrect calculation distance in QuadTree
> --
>
> Key: FLINK-4148
> URL: https://issues.apache.org/jira/browse/FLINK-4148
> Project: Flink
>  Issue Type: Bug
>Reporter: Alexey Diomin
>Priority: Trivial
> Attachments: 
> 0001-FLINK-4148-incorrect-calculation-minDist-distance-in.patch
>
>
> https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/nn/QuadTree.scala#L105
> Because EuclideanDistanceMetric extends SquaredEuclideanDistanceMetric, we 
> always match the first case and never reach the case that applies 
> math.sqrt(minDist).
> The correct fix is to match EuclideanDistanceMetric first and 
> SquaredEuclideanDistanceMetric after it.
> P.S. Because EuclideanDistanceMetric is more expensive to compute and is the 
> default DistanceMetric, this can cause some performance degradation for KNN 
> with default parameters.





[jira] [Created] (FLINK-4795) CsvStringify crashes in case of tuple in tuple, i.e. ("a", True, (1,5))

2016-10-10 Thread Yakov Goldberg (JIRA)
Yakov Goldberg created FLINK-4795:
-

 Summary: CsvStringify crashes in case of tuple in tuple, i.e. 
("a", True, (1,5))
 Key: FLINK-4795
 URL: https://issues.apache.org/jira/browse/FLINK-4795
 Project: Flink
  Issue Type: Bug
  Components: Python API
Reporter: Yakov Goldberg


CsvStringify crashes in case of a tuple inside a tuple, i.e. ("a", True, (1,5)).

Looks like a typo in CsvStringify._map():

def _map(self, value):
    if isinstance(value, (tuple, list)):
        # bug: self.map() does not exist; this should call self._map()
        return "(" + b", ".join([self.map(x) for x in value]) + ")"
    else:
        return str(value)

self._map() should be called instead.

But this fix will affect write_csv() and read_csv():
write_csv() will then work automatically,
while read_csv() still needs to be extended to be able to read the Tuple type.






[jira] [Created] (FLINK-4794) partition_by_hash() crashes if no parameter is provided

2016-10-10 Thread Yakov Goldberg (JIRA)
Yakov Goldberg created FLINK-4794:
-

 Summary: partition_by_hash() crashes if no parameter is provided
 Key: FLINK-4794
 URL: https://issues.apache.org/jira/browse/FLINK-4794
 Project: Flink
  Issue Type: Bug
  Components: Python API
Reporter: Yakov Goldberg


partition_by_hash() crashes if no parameter is provided.
Looks like a line of code was missed; compare with distinct(), which guards 
against the empty-fields case:

def distinct(self, *fields):
    f = None
    if len(fields) == 0:
        # fall back to keying by the whole record when no fields are given
        f = lambda x: (x,)
        fields = (0,)





[jira] [Created] (FLINK-4793) Using a local method with :: notation in Java 8 causes index out of bounds

2016-10-10 Thread Ted Dunning (JIRA)
Ted Dunning created FLINK-4793:
--

 Summary: Using a local method with :: notation in Java 8 causes 
index out of bounds
 Key: FLINK-4793
 URL: https://issues.apache.org/jira/browse/FLINK-4793
 Project: Flink
  Issue Type: Bug
Reporter: Ted Dunning


I tried to use the toString method on an object as a map function:
{code}
.map(Trade::toString)
{code}
This caused an index out of bounds error:
{code}
java.lang.ArrayIndexOutOfBoundsException: -1
at 
org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:351)
at 
org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:305)
at 
org.apache.flink.api.java.typeutils.TypeExtractor.getMapReturnTypes(TypeExtractor.java:120)
at 
org.apache.flink.streaming.api.datastream.DataStream.map(DataStream.java:506)
at 
com.mapr.aggregate.AggregateTest.testAggregateTrades(AggregateTest.java:81)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at 
com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:68)
{code}
On the other hand, if I use a public static method, like this:
{code}
.map(Trade::fromString)
{code}
All is good. fromString and toString are defined like this:
{code}
public static Trade fromString(String s) throws IOException {
return mapper.readValue(s, Trade.class);
}

@Override
public String toString() {
return String.format("{\"%s\", %d, %d, %.2f}", symbol, time, volume, 
price);
}
{code}

This might be a viable restriction on what functions I can use, but there 
certainly should be a better error message, if so.





[GitHub] flink pull request #2617: [FLINK-4705] Instrument FixedLengthRecordSorter

2016-10-10 Thread greghogan
GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/2617

[FLINK-4705] Instrument FixedLengthRecordSorter

Updates comparators with support for key normalization.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 
4705_instrument_fixedlengthrecordsorter

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2617.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2617


commit 10d0955251ad9965da2094682ea1d4c468602373
Author: Greg Hogan 
Date:   2016-09-28T16:55:03Z

[FLINK-4705] Instrument FixedLengthRecordSorter

Updates comparators with support for key normalization.






[jira] [Commented] (FLINK-4705) Instrument FixedLengthRecordSorter

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563930#comment-15563930
 ] 

ASF GitHub Bot commented on FLINK-4705:
---

GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/2617

[FLINK-4705] Instrument FixedLengthRecordSorter

Updates comparators with support for key normalization.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 
4705_instrument_fixedlengthrecordsorter

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2617.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2617


commit 10d0955251ad9965da2094682ea1d4c468602373
Author: Greg Hogan 
Date:   2016-09-28T16:55:03Z

[FLINK-4705] Instrument FixedLengthRecordSorter

Updates comparators with support for key normalization.




> Instrument FixedLengthRecordSorter
> --
>
> Key: FLINK-4705
> URL: https://issues.apache.org/jira/browse/FLINK-4705
> Project: Flink
>  Issue Type: Improvement
>  Components: Local Runtime
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>
> The {{NormalizedKeySorter}} sorts on the concatenation of (potentially 
> partial) keys plus an 8-byte pointer to the record. After sorting, each 
> pointer must be dereferenced, which is not cache friendly.
> The {{FixedLengthRecordSorter}} sorts on the concatenation of full keys 
> followed by the remainder of the record. The records can then be deserialized 
> in sequence.
> Instrumenting the {{FixedLengthRecordSorter}} requires implementing the 
> comparator methods {{writeWithKeyNormalization}} and 
> {{readWithKeyNormalization}}.
> Testing {{JaccardIndex}} on an m4.16xlarge, the scale 18 runtime dropped from 
> 71.8 to 68.8 s (4.3% faster) and the scale 20 runtime dropped from 546.1 to 
> 501.8 s (8.8% faster).
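
For orientation, a hedged sketch of the two methods for a hypothetical record 
type that is just a Long (the real signatures live on 
org.apache.flink.api.common.typeutils.TypeComparator):

{code}
import org.apache.flink.core.memory.{DataInputView, DataOutputView}

object LongNormalizationSketch {
  // Write the record so that its serialized bytes sort like its key. Real
  // comparators also flip the sign bit so that unsigned byte order matches
  // signed numeric order; that detail is omitted here.
  def writeWithKeyNormalization(record: Long, target: DataOutputView): Unit =
    target.writeLong(record)

  // Restore a record previously written by writeWithKeyNormalization.
  def readWithKeyNormalization(reuse: Long, source: DataInputView): Long =
    source.readLong()
}
{code}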





[jira] [Commented] (FLINK-4778) Update program example in /docs/setup/cli.md due to the change in FLINK-2021

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563157#comment-15563157
 ] 

ASF GitHub Bot commented on FLINK-4778:
---

Github user heytitle commented on the issue:

https://github.com/apache/flink/pull/2611
  
@fhueske you're welcome  


> Update program example in /docs/setup/cli.md due to the change in FLINK-2021
> 
>
> Key: FLINK-4778
> URL: https://issues.apache.org/jira/browse/FLINK-4778
> Project: Flink
>  Issue Type: Task
>Reporter: Pattarawat Chormai
>Priority: Trivial
>  Labels: documentation, starter
> Fix For: 1.2.0, 1.1.4
>
>
> According to FLINK-2021, ParameterTool was introduced, hence all input 
> parameters need to be passed with a corresponding prefix such as
> {noformat} --input, --output {noformat}
> However, the WordCount call in 
> https://github.com/apache/flink/blob/master/docs/setup/cli.md#examples hasn't 
> been updated yet.
> REF: 
> https://github.com/apache/flink/commit/0629e25602eefdc239e8e72d9e3c9c1a5164448e
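
A minimal sketch of the prefixed-argument style the description refers to 
(Scala; invoked e.g. as: flink run WordCount.jar --input /in --output /out):

{code}
import org.apache.flink.api.java.utils.ParameterTool

object PrefixedArgsSketch {
  def main(args: Array[String]): Unit = {
    val params = ParameterTool.fromArgs(args)
    val input  = params.get("input")   // value following --input
    val output = params.get("output")  // value following --output
    println(s"reading from $input, writing to $output")
  }
}
{code}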





[GitHub] flink issue #2611: [FLINK-4778] Update program example in /docs/setup/cli.md...

2016-10-10 Thread heytitle
Github user heytitle commented on the issue:

https://github.com/apache/flink/pull/2611
  
@fhueske you're welcome 😄 




[jira] [Closed] (FLINK-2765) Upgrade hbase version for hadoop-2 to 1.2 release

2016-10-10 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-2765.

   Resolution: Fixed
 Assignee: Fabian Hueske
Fix Version/s: 1.2.0

Fixed for 1.2.0 with cf22006f749b195a4d65dedc90599aace9956b85

> Upgrade hbase version for hadoop-2 to 1.2 release
> -
>
> Key: FLINK-2765
> URL: https://issues.apache.org/jira/browse/FLINK-2765
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Fabian Hueske
>Priority: Minor
> Fix For: 1.2.0
>
>
> Currently 0.98.11 is used:
> {code}
> 0.98.11-hadoop2
> {code}
> The stable release line for hadoop-2 is 1.2.x.
> We should upgrade to 1.2.1.





[jira] [Closed] (FLINK-4311) TableInputFormat fails when reused on next split

2016-10-10 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-4311.

   Resolution: Fixed
Fix Version/s: 1.2.0

Fixed for 1.2.0 with 3f8727921e944d1d89714f5885c2de63681d51b2

Thanks for the fix [~nielsbasjes]!

> TableInputFormat fails when reused on next split
> 
>
> Key: FLINK-4311
> URL: https://issues.apache.org/jira/browse/FLINK-4311
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.3
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>Priority: Critical
> Fix For: 1.2.0, 1.1.3
>
>
> We have written a batch job that uses data from HBase by means of using the 
> TableInputFormat.
> We have found that this class sometimes fails with this exception:
> {quote}
> java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: 
> Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@4f4efe4b
>  rejected from java.util.concurrent.ThreadPoolExecutor@7872d5c1[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1165]
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:208)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.(ClientScanner.java:155)
>   at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:821)
>   at 
> org.apache.flink.addons.hbase.TableInputFormat.open(TableInputFormat.java:152)
>   at 
> org.apache.flink.addons.hbase.TableInputFormat.open(TableInputFormat.java:47)
>   at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:147)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@4f4efe4b
>  rejected from java.util.concurrent.ThreadPoolExecutor@7872d5c1[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1165]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:142)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:269)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:165)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>   ... 10 more
> {quote}
> As you can see the ThreadPoolExecutor was terminated at this point.
> We tracked it down to the fact that 
> # the configure method opens the table
> # the open method obtains the result scanner
> # the close method closes the table.
> If a second split arrives on the same instance then the open method will fail 
> because the table has already been closed.
> We also found that this error varies with the versions of HBase that are 
> used. I have also seen this exception:
> {quote}
> Caused by: java.io.IOException: hconnection-0x19d37183 closed
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1146)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
>   ... 37 more
> {quote}
> I found that in the [documentation of the InputFormat 
> interface|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/io/InputFormat.html]
>  it clearly states
> {quote}IMPORTANT NOTE: Input formats must be written such that an instance 
> can be opened again after it was closed. That is due to the fact that the 
> input format is used for potentially multiple splits. After a split is done, 
> the format's close function is invoked and, if another split is available, 
> the open function is invoked afterwards for the next split.{quote}
> It appears that this specific InputFormat has 
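
For illustration, a minimal sketch (hypothetical names, not the actual fix) of 
the lifecycle the quoted contract requires: per-split resources are acquired 
in open() and released in close(), so the same instance survives 
open -> close -> open for the next split:

{code}
class SplitLocalResourceSketch(create: () => AutoCloseable) {
  private var table: AutoCloseable = _ // stand-in for an HBase table handle

  def open(): Unit = {
    table = create() // (re)acquire per split, never only once in configure()
  }

  def close(): Unit = {
    if (table != null) {
      table.close()
      table = null
    }
  }
}
{code}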

[jira] [Closed] (FLINK-4778) Update program example in /docs/setup/cli.md due to the change in FLINK-2021

2016-10-10 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-4778.

   Resolution: Fixed
Fix Version/s: 1.2.0
   1.1.4

Fixed for 1.1.x with 2203f743a6bfd2db483603d1f45e210cee935f17
Fixed for 1.2.0 with a079259f3cfe3dc0717439fd65ce5de17cf69fb5

> Update program example in /docs/setup/cli.md due to the change in FLINK-2021
> 
>
> Key: FLINK-4778
> URL: https://issues.apache.org/jira/browse/FLINK-4778
> Project: Flink
>  Issue Type: Task
>Reporter: Pattarawat Chormai
>Priority: Trivial
>  Labels: documentation, starter
> Fix For: 1.1.4, 1.2.0
>
>
> According to FLINK-2021, ParameterTool was introduced, hence all input 
> parameters need to be passed with a corresponding prefix such as
> {noformat} --input, --output {noformat}
> However, the WordCount call in 
> https://github.com/apache/flink/blob/master/docs/setup/cli.md#examples hasn't 
> been updated yet.
> REF: 
> https://github.com/apache/flink/commit/0629e25602eefdc239e8e72d9e3c9c1a5164448e





[GitHub] flink pull request #2330: FLINK-4311 Fixed several problems in TableInputFor...

2016-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2330




[GitHub] flink pull request #2595: [FLINK-3656] [table] Test base for logical unit te...

2016-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2595




[jira] [Commented] (FLINK-3656) Rework Table API tests

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563130#comment-15563130
 ] 

ASF GitHub Bot commented on FLINK-3656:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2595


> Rework Table API tests
> --
>
> Key: FLINK-3656
> URL: https://issues.apache.org/jira/browse/FLINK-3656
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Vasia Kalavri
>  Labels: starter
>
> The Table API tests are very inefficient. At the moment they are mostly 
> end-to-end integration tests, often testing the same functionality several 
> times (Java/Scala, DataSet/DataStream).
> We should look into how we can rework the Table API tests such that:
> - long-running integration tests are converted into faster unit tests
> - common parts of DataSet and DataStream are only tested once
> - common parts of Java and Scala Table APIs are only tested once
> - duplicate tests are completely removed





[jira] [Commented] (FLINK-4311) TableInputFormat fails when reused on next split

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563131#comment-15563131
 ] 

ASF GitHub Bot commented on FLINK-4311:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2330


> TableInputFormat fails when reused on next split
> 
>
> Key: FLINK-4311
> URL: https://issues.apache.org/jira/browse/FLINK-4311
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.3
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>Priority: Critical
> Fix For: 1.1.3
>
>
> We have written a batch job that uses data from HBase by means of using the 
> TableInputFormat.
> We have found that this class sometimes fails with this exception:
> {quote}
> java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: 
> Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@4f4efe4b
>  rejected from java.util.concurrent.ThreadPoolExecutor@7872d5c1[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1165]
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:208)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.(ClientScanner.java:155)
>   at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:821)
>   at 
> org.apache.flink.addons.hbase.TableInputFormat.open(TableInputFormat.java:152)
>   at 
> org.apache.flink.addons.hbase.TableInputFormat.open(TableInputFormat.java:47)
>   at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:147)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@4f4efe4b
>  rejected from java.util.concurrent.ThreadPoolExecutor@7872d5c1[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1165]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:142)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:269)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:165)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>   ... 10 more
> {quote}
> As you can see the ThreadPoolExecutor was terminated at this point.
> We tracked it down to the fact that 
> # the configure method opens the table
> # the open method obtains the result scanner
> # the close method closes the table.
> If a second split arrives on the same instance then the open method will fail 
> because the table has already been closed.
> We also found that this error varies with the versions of HBase that are 
> used. I have also seen this exception:
> {quote}
> Caused by: java.io.IOException: hconnection-0x19d37183 closed
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1146)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
>   ... 37 more
> {quote}
> I found that in the [documentation of the InputFormat 
> interface|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/io/InputFormat.html]
>  it clearly states
> {quote}IMPORTANT NOTE: Input formats must be written such that an instance 
> can be opened again after it was closed. That is due to the fact that the 
> input format is used for potentially multiple splits. After a split is done, 
> the format's close function is invoked and, if another split is available, 
> the open function is invoked afterwards for the next split.{quote}
> It appears that this specific InputFormat has not been checked 

[jira] [Commented] (FLINK-4778) Update program example in /docs/setup/cli.md due to the change in FLINK-2021

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563132#comment-15563132
 ] 

ASF GitHub Bot commented on FLINK-4778:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2611


> Update program example in /docs/setup/cli.md due to the change in FLINK-2021
> 
>
> Key: FLINK-4778
> URL: https://issues.apache.org/jira/browse/FLINK-4778
> Project: Flink
>  Issue Type: Task
>Reporter: Pattarawat Chormai
>Priority: Trivial
>  Labels: documentation, starter
>
> According to FLINK-2021, ParameterTool was introduced, hence all input 
> parameters need to be passed with a corresponding prefix such as
> {noformat} --input, --output {noformat}
> However, the WordCount call in 
> https://github.com/apache/flink/blob/master/docs/setup/cli.md#examples hasn't 
> been updated yet.
> REF: 
> https://github.com/apache/flink/commit/0629e25602eefdc239e8e72d9e3c9c1a5164448e





[GitHub] flink pull request #2611: [FLINK-4778] Update program example in /docs/setup...

2016-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2611




[jira] [Resolved] (FLINK-4788) State backend class cannot be loaded, because fully qualified name converted to lower-case

2016-10-10 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-4788.
-
Resolution: Fixed

Fixed in
  - 1.1.3 via d619f51ac8f922c0cf1d1e789c5141076128f04e
  - 1.2.0 via 9e17cbd6b768f73299a6a344fdf44539802fb76c

> State backend class cannot be loaded, because fully qualified name converted 
> to lower-case
> --
>
> Key: FLINK-4788
> URL: https://issues.apache.org/jira/browse/FLINK-4788
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.2
> Environment: Debian Linux and Mac OS X
>Reporter: Florian Koenig
>Assignee: Stephan Ewen
>Priority: Blocker
> Fix For: 1.2.0, 1.1.3
>
>
> The name of the state backend in the configuration file is converted to 
> lower-case (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L587).
> Therefore, the class cannot be found, and the state backend (e.g., RocksDB) 
> is not loaded properly (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L603).
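
A hedged reconstruction of the failure mode (the factory class name below is 
illustrative):

{code}
object LowerCaseBugSketch extends App {
  val configured =
    "org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory"
  // Lower-casing is harmless for built-in keywords but corrupts a fully
  // qualified class name, so it can no longer be loaded.
  val backendName = configured.toLowerCase

  backendName match {
    case "jobmanager" | "filesystem" => println("built-in backend")
    case factoryClass => Class.forName(factoryClass) // ClassNotFoundException
  }
}
{code}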





[jira] [Closed] (FLINK-4788) State backend class cannot be loaded, because fully qualified name converted to lower-case

2016-10-10 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-4788.
---

> State backend class cannot be loaded, because fully qualified name converted 
> to lower-case
> --
>
> Key: FLINK-4788
> URL: https://issues.apache.org/jira/browse/FLINK-4788
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.2
> Environment: Debian Linux and Mac OS X
>Reporter: Florian Koenig
>Assignee: Stephan Ewen
>Priority: Blocker
> Fix For: 1.2.0, 1.1.3
>
>
> The name of the state backend in the configuration file is converted to 
> lower-case (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L587).
> Therefore, the class cannot be found, and the state backend (e.g., RocksDB) 
> is not loaded properly (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L603).





[jira] [Updated] (FLINK-4792) Update documentation - FlinkML/QuickStart Guide

2016-10-10 Thread Thomas FOURNIER (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas FOURNIER updated FLINK-4792:
---
Summary: Update documentation - FlinkML/QuickStart Guide  (was: Update 
documentation - QuickStart - FlinkML)

> Update documentation - FlinkML/QuickStart Guide
> ---
>
> Key: FLINK-4792
> URL: https://issues.apache.org/jira/browse/FLINK-4792
> Project: Flink
>  Issue Type: Improvement
>Reporter: Thomas FOURNIER
>Priority: Trivial
>
> Hi,
> I'm going through the first steps of FlinkML/QuickStart guide: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/quickstart.html
> When using env.readCsvFile:
> val env = ExecutionEnvironment.getExecutionEnvironment
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> I encounter the following error:
> Error:(17, 69) could not find implicit value for evidence parameter of type 
> org.apache.flink.api.common.typeinfo.TypeInformation[(String, String, String, 
> String)]
> Error occurred in an application involving default arguments.
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> To solve this issue, you need the following import:
> import org.apache.flink.api.scala._ (instead of import 
> org.apache.flink.api.scala.ExecutionEnvironment as documented).
> I think it would be relevant to update the documentation, even if this point 
> is mentioned in the FAQ.
> Thanks
> Best
> Thomas





[jira] [Created] (FLINK-4792) Update documentation - QuickStart - FlinkML

2016-10-10 Thread Thomas FOURNIER (JIRA)
Thomas FOURNIER created FLINK-4792:
--

 Summary: Update documentation - QuickStart - FlinkML
 Key: FLINK-4792
 URL: https://issues.apache.org/jira/browse/FLINK-4792
 Project: Flink
  Issue Type: Improvement
Reporter: Thomas FOURNIER
Priority: Trivial


Hi,

I'm going through the first steps of FlinkML/QuickStart guide: 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/quickstart.html

When using env.readCsvFile:

val env = ExecutionEnvironment.getExecutionEnvironment
val survival = env.readCsvFile[(String, String, String, String)]("path/data")

I encounter the following error:

Error:(17, 69) could not find implicit value for evidence parameter of type 
org.apache.flink.api.common.typeinfo.TypeInformation[(String, String, String, 
String)]
Error occurred in an application involving default arguments.
val survival = env.readCsvFile[(String, String, String, String)]("path/data")

To solve this issue, you need the following import:

import org.apache.flink.api.scala._ (instead of import 
org.apache.flink.api.scala.ExecutionEnvironment as documented).

I think it would be relevant to update the documentation, even if this point is 
mentioned in the FAQ.

Thanks

Best

Thomas







[jira] [Commented] (FLINK-3930) Implement Service-Level Authorization

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562746#comment-15562746
 ] 

ASF GitHub Bot commented on FLINK-3930:
---

Github user vijikarthi commented on the issue:

https://github.com/apache/flink/pull/2425
  
@rmetzger can you please take a look at the updated patch


> Implement Service-Level Authorization
> -
>
> Key: FLINK-3930
> URL: https://issues.apache.org/jira/browse/FLINK-3930
> Project: Flink
>  Issue Type: New Feature
>  Components: Security
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>  Labels: security
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> _This issue is part of a series of improvements detailed in the [Secure Data 
> Access|https://docs.google.com/document/d/1-GQB6uVOyoaXGwtqwqLV8BHDxWiMO2WnVzBoJ8oPaAs/edit?usp=sharing]
>  design doc._
> Service-level authorization is the initial authorization mechanism to ensure 
> clients (or servers) connecting to the Flink cluster are authorized to do so. 
>   The purpose is to prevent a cluster from being used by an unauthorized 
> user, whether to execute jobs, disrupt cluster functionality, or gain access 
> to secrets stored within the cluster.
> Implement service-level authorization as described in the design doc.
> - Introduce a shared secret cookie
> - Enable Akka security cookie
> - Implement data transfer authentication
> - Secure the web dashboard





[GitHub] flink issue #2425: FLINK-3930 Added shared secret based authorization for Fl...

2016-10-10 Thread vijikarthi
Github user vijikarthi commented on the issue:

https://github.com/apache/flink/pull/2425
  
@rmetzger can you please take a look at the updated patch




[jira] [Assigned] (FLINK-4108) NPE in Row.productArity

2016-10-10 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther reassigned FLINK-4108:
---

Assignee: Timo Walther

> NPE in Row.productArity
> ---
>
> Key: FLINK-4108
> URL: https://issues.apache.org/jira/browse/FLINK-4108
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats, Type 
> Serialization System
>Affects Versions: 1.1.0
>Reporter: Martin Scholl
>Assignee: Timo Walther
>
> [this is my first issue report here, please excuse me if something is 
> missing]
>  JDBCInputFormat of flink 1.1-SNAPSHOT fails with an NPE in Row.productArity:
> {quote}
> java.io.IOException: Couldn't access resultSet
> at 
> org.apache.flink.api.java.io.jdbc.JDBCInputFormat.nextRecord(JDBCInputFormat.java:288)
> at 
> org.apache.flink.api.java.io.jdbc.JDBCInputFormat.nextRecord(JDBCInputFormat.java:98)
> at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:162)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:588)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at org.apache.flink.api.table.Row.productArity(Row.scala:28)
> at 
> org.apache.flink.api.java.io.jdbc.JDBCInputFormat.nextRecord(JDBCInputFormat.java:279)
> ... 4 more
> {quote}
> The code reproduce this can be found in this gist: 
> https://gist.github.com/zeitgeist/b91a60460661618ca4585e082895c616
> The reason for the NPE, I believe, is the way through which Flink creates Row 
> instances through Kryo. As Row expects the number of fields to allocate as a 
> parameter, which Kryo does not provide, the ‘fields’ member of Row ends up 
> being null. As I'm neither a reflection nor a Kryo expert, I'd rather leave a 
> true analysis to more knowledgeable programmers.
> Part of the aforementioned example is a not very elegant workaround through a 
> custom type and a cast (function {{jdbcNoIssue}} + custom Row type {{MyRow}}) 
> to serve as a further hint towards my theory.
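
A hedged sketch (simplified, not the actual Row source) of why 
constructor-bypassing instantiation leaves `fields` null:

{code}
// Row's only constructor takes the arity and allocates the fields array;
// productArity then reads that array.
class RowSketch(arity: Int) extends Serializable {
  val fields = new Array[Any](arity)
  def productArity: Int = fields.length // NPE if `fields` was never assigned
}

// Kryo can instantiate objects without invoking the (Int) constructor, so
// `fields` stays null and the first productArity call throws, matching the
// stack trace above.
{code}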





[jira] [Commented] (FLINK-4035) Bump Kafka producer in Kafka sink to Kafka 0.10.0.0

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562537#comment-15562537
 ] 

ASF GitHub Bot commented on FLINK-4035:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2369
  
Tests are running https://travis-ci.org/rmetzger/flink/builds/166441047


> Bump Kafka producer in Kafka sink to Kafka 0.10.0.0
> ---
>
> Key: FLINK-4035
> URL: https://issues.apache.org/jira/browse/FLINK-4035
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>Assignee: Robert Metzger
>Priority: Minor
>
> Kafka 0.10.0.0 introduced protocol changes related to the producer.  
> Published messages now include timestamps and compressed messages now include 
> relative offsets.  As it is now, brokers must decompress publisher compressed 
> messages, assign offset to them, and recompress them, which is wasteful and 
> makes it less likely that compression will be used at all.





[jira] [Commented] (FLINK-4439) Error message KafkaConsumer08 when all 'bootstrap.servers' are invalid

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562534#comment-15562534
 ] 

ASF GitHub Bot commented on FLINK-4439:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2397


> Error message KafkaConsumer08 when all 'bootstrap.servers' are invalid
> --
>
> Key: FLINK-4439
> URL: https://issues.apache.org/jira/browse/FLINK-4439
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.3
>Reporter: Gheorghe Gheorghe
>Priority: Minor
> Fix For: 1.2.0
>
>
> The "flink-connector-kafka-0.8_2"  is logging the following error when all 
> 'bootstrap.servers' are invalid when passed to the FlinkKafkaConsumer08. 
> See stacktrace: 
> {code:title=stacktrace|borderStyle=solid}
> 2016-08-21 15:22:30 WARN  FlinkKafkaConsumerBase:290 - Error communicating 
> with broker inexistentKafkHost:9092 to find partitions for [testTopic].class 
> java.nio.channels.ClosedChannelException. Message: null
> 2016-08-21 15:22:30 DEBUG FlinkKafkaConsumerBase:292 - Detailed trace
> java.nio.channels.ClosedChannelException
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
>   at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:78)
>   at 
> kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:68)
>   at kafka.consumer.SimpleConsumer.send(SimpleConsumer.scala:91)
>   at kafka.javaapi.consumer.SimpleConsumer.send(SimpleConsumer.scala:68)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.getPartitionsForTopic(FlinkKafkaConsumer08.java:264)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.(FlinkKafkaConsumer08.java:193)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.(FlinkKafkaConsumer08.java:164)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.(FlinkKafkaConsumer08.java:131)
>   at MetricsFromKafka$.main(MetricsFromKafka.scala:38)
>   at MetricsFromKafka.main(MetricsFromKafka.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at sbt.Run.invokeMain(Run.scala:67)
>   at sbt.Run.run0(Run.scala:61)
>   at sbt.Run.sbt$Run$$execute$1(Run.scala:51)
>   at sbt.Run$$anonfun$run$1.apply$mcV$sp(Run.scala:55)
>   at sbt.Run$$anonfun$run$1.apply(Run.scala:55)
>   at sbt.Run$$anonfun$run$1.apply(Run.scala:55)
>   at sbt.Logger$$anon$4.apply(Logger.scala:84)
>   at sbt.TrapExit$App.run(TrapExit.scala:248)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> In the above stack trace it is hard to figure out that the servers provided 
> in the config cannot be resolved to a valid IP address. Moreover, the Flink 
> Kafka consumer will try all of those servers one by one, failing to get 
> partition information from each.
> The suggested improvement is to fail fast and inform the user that the 
> servers provided in the 'bootstrap.servers' config are invalid. If at least 
> one server is valid, the exception should not be thrown. 
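
A minimal sketch of the suggested fail-fast check (hypothetical helper, not 
actual connector code):

{code}
import java.net.{InetAddress, UnknownHostException}

object BootstrapCheckSketch {
  def validate(servers: Seq[String]): Unit = {
    // Resolve each configured server up front; fail only if none is valid.
    val anyResolvable = servers.exists { server =>
      val host = server.split(":")(0)
      try { InetAddress.getByName(host); true }
      catch { case _: UnknownHostException => false }
    }
    require(anyResolvable,
      s"None of the configured 'bootstrap.servers' could be resolved: $servers")
  }
}
{code}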





[GitHub] flink issue #2369: [FLINK-4035] Add a streaming connector for Apache Kafka 0...

2016-10-10 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2369
  
Tests are running https://travis-ci.org/rmetzger/flink/builds/166441047




[jira] [Commented] (FLINK-4506) CsvOutputFormat defaults allowNullValues to false, even though doc and declaration say true

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562533#comment-15562533
 ] 

ASF GitHub Bot commented on FLINK-4506:
---

Github user kirill-morozov-epam commented on a diff in the pull request:

https://github.com/apache/flink/pull/2477#discussion_r82622593
  
--- Diff: 
flink-java/src/test/java/org/apache/flink/api/java/io/CsvOutputFormatTest.java 
---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.java.io;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.flink.api.common.io.FileOutputFormat;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.FSDataInputStream;
+import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.core.fs.Path;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+
+public class CsvOutputFormatTest {
+
+   private static final Path PATH = new Path("csv_output_test_file.csv");
--- End diff --

Fixed in 
https://github.com/apache/flink/pull/2477/commits/e789411651ab7a932acf15662977a13ae2833b57


> CsvOutputFormat defaults allowNullValues to false, even though doc and 
> declaration say true
> 
>
> Key: FLINK-4506
> URL: https://issues.apache.org/jira/browse/FLINK-4506
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats, Documentation
>Reporter: Michael Wong
>Assignee: Kirill Morozov
>Priority: Minor
>
> In the constructor, it has this
> {code}
> this.allowNullValues = false;
> {code}
> But in the setAllowNullValues() method, the doc says allowNullValues is 
> true by default. Also, in the declaration of allowNullValues, the value is 
> set to true. It probably makes the most sense to change the constructor.
> {code}
>   /**
>* Configures the format to either allow null values (writing an empty 
> field),
>* or to throw an exception when encountering a null field.
>* 
>* by default, null values are allowed.
>*
>* @param allowNulls Flag to indicate whether the output format should 
> accept null values.
>*/
>   public void setAllowNullValues(boolean allowNulls) {
>   this.allowNullValues = allowNulls;
>   }
> {code}





[GitHub] flink pull request #2477: [FLINK-4506] CsvOutputFormat defaults allowNullVal...

2016-10-10 Thread kirill-morozov-epam
Github user kirill-morozov-epam commented on a diff in the pull request:

https://github.com/apache/flink/pull/2477#discussion_r82622593
  
--- Diff: 
flink-java/src/test/java/org/apache/flink/api/java/io/CsvOutputFormatTest.java 
---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.java.io;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.flink.api.common.io.FileOutputFormat;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.FSDataInputStream;
+import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.core.fs.Path;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+
+public class CsvOutputFormatTest {
+
+   private static final Path PATH = new Path("csv_output_test_file.csv");
--- End diff --

Fixed in 
https://github.com/apache/flink/pull/2477/commits/e789411651ab7a932acf15662977a13ae2833b57




[GitHub] flink pull request #2397: [FLINK-4439] Validate 'bootstrap.servers' config i...

2016-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2397


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-4439) Error message KafkaConsumer08 when all 'bootstrap.servers' are invalid

2016-10-10 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-4439.
---
   Resolution: Fixed
Fix Version/s: 1.2.0

Issue resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/1836e08f

> Error message KafkaConsumer08 when all 'bootstrap.servers' are invalid
> --
>
> Key: FLINK-4439
> URL: https://issues.apache.org/jira/browse/FLINK-4439
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.3
>Reporter: Gheorghe Gheorghe
>Priority: Minor
> Fix For: 1.2.0
>
>
> The "flink-connector-kafka-0.8_2"  is logging the following error when all 
> 'bootstrap.servers' are invalid when passed to the FlinkKafkaConsumer08. 
> See stacktrace: 
> {code:title=stacktrace|borderStyle=solid}
> 2016-08-21 15:22:30 WARN  FlinkKafkaConsumerBase:290 - Error communicating 
> with broker inexistentKafkHost:9092 to find partitions for [testTopic].class 
> java.nio.channels.ClosedChannelException. Message: null
> 2016-08-21 15:22:30 DEBUG FlinkKafkaConsumerBase:292 - Detailed trace
> java.nio.channels.ClosedChannelException
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
>   at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:78)
>   at 
> kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:68)
>   at kafka.consumer.SimpleConsumer.send(SimpleConsumer.scala:91)
>   at kafka.javaapi.consumer.SimpleConsumer.send(SimpleConsumer.scala:68)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.getPartitionsForTopic(FlinkKafkaConsumer08.java:264)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.(FlinkKafkaConsumer08.java:193)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.(FlinkKafkaConsumer08.java:164)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.(FlinkKafkaConsumer08.java:131)
>   at MetricsFromKafka$.main(MetricsFromKafka.scala:38)
>   at MetricsFromKafka.main(MetricsFromKafka.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at sbt.Run.invokeMain(Run.scala:67)
>   at sbt.Run.run0(Run.scala:61)
>   at sbt.Run.sbt$Run$$execute$1(Run.scala:51)
>   at sbt.Run$$anonfun$run$1.apply$mcV$sp(Run.scala:55)
>   at sbt.Run$$anonfun$run$1.apply(Run.scala:55)
>   at sbt.Run$$anonfun$run$1.apply(Run.scala:55)
>   at sbt.Logger$$anon$4.apply(Logger.scala:84)
>   at sbt.TrapExit$App.run(TrapExit.scala:248)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> From the above stacktrace it is hard to figure out that the servers 
> provided in the config cannot be resolved to a valid IP address. Moreover, the 
> Flink Kafka consumer will try all of those servers one by one, failing to 
> get partition information from each.
> The suggested improvement is to fail fast and inform the user that the 
> servers provided in the 'bootstrap.servers' config are invalid. If at least 
> one server is valid, then the exception should not be thrown. 
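
As an illustration of the fail-fast idea (a sketch only, not necessarily how PR #2397 implements the check):

{code}
import java.net.InetSocketAddress;

// Hypothetical helper: throws if no entry in 'bootstrap.servers' resolves.
public static void validateBootstrapServers(String bootstrapServers) {
    boolean atLeastOneValid = false;
    for (String hostPort : bootstrapServers.split(",")) {
        String[] parts = hostPort.trim().split(":");
        if (parts.length == 2) {
            // InetSocketAddress performs the DNS lookup in its constructor.
            InetSocketAddress addr = new InetSocketAddress(parts[0], Integer.parseInt(parts[1]));
            if (!addr.isUnresolved()) {
                atLeastOneValid = true;
            }
        }
    }
    if (!atLeastOneValid) {
        throw new IllegalArgumentException(
            "None of the servers in 'bootstrap.servers' could be resolved: " + bootstrapServers);
    }
}
{code}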



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4791) Fix issues caused by expression reduction

2016-10-10 Thread Timo Walther (JIRA)
Timo Walther created FLINK-4791:
---

 Summary: Fix issues caused by expression reduction
 Key: FLINK-4791
 URL: https://issues.apache.org/jira/browse/FLINK-4791
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Reporter: Timo Walther
Assignee: Timo Walther


While updating {{ExpressionTestBase}} for FLINK-4294, I noticed that many 
expressions fail when applying the {{ReduceExpressionRule}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4398) Unstable test KvStateServerHandlerTest.testSimpleQuery

2016-10-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562445#comment-15562445
 ] 

Robert Metzger commented on FLINK-4398:
---

Another instance:
{code}
Failed tests: 
  KvStateServerHandlerTest.testSimpleQuery:152 expected:<1> but was:<0>
{code}
in https://api.travis-ci.org/jobs/166416151/log.txt?deansi=true

> Unstable test KvStateServerHandlerTest.testSimpleQuery
> --
>
> Key: FLINK-4398
> URL: https://issues.apache.org/jira/browse/FLINK-4398
> Project: Flink
>  Issue Type: Bug
>Reporter: Kostas Kloudas
>Assignee: Ufuk Celebi
>  Labels: test-stability
>
> An instance can be found here:
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/152392521/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-4790) FlinkML - Error of type while using readCsvFile

2016-10-10 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-4790.
---
Resolution: Won't Fix

> FlinkML - Error of type while using readCsvFile
> ---
>
> Key: FLINK-4790
> URL: https://issues.apache.org/jira/browse/FLINK-4790
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.2
>Reporter: Thomas FOURNIER
>Priority: Minor
>  Labels: patch
>
> Hi,
> I'm going through the FlinkML QuickStart guide: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/quickstart.html
> When using env.readCsvFile:
> val env = ExecutionEnvironment.getExecutionEnvironment
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> I encounter the following error:
> Error:(17, 69) could not find implicit value for evidence parameter of type 
> org.apache.flink.api.common.typeinfo.TypeInformation[(String, String, String, 
> String)]
> Error occurred in an application involving default arguments.
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> ^
> Is it related to this issue?
> https://issues.apache.org/jira/browse/FLINK-1255
> Thanks
> Thomas



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4790) FlinkML - Error of type while using readCsvFile

2016-10-10 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562442#comment-15562442
 ] 

Timo Walther commented on FLINK-4790:
-

I will close the issue for now. But you are welcome to open a PR for improving 
the documentation.

> FlinkML - Error of type while using readCsvFile
> ---
>
> Key: FLINK-4790
> URL: https://issues.apache.org/jira/browse/FLINK-4790
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.2
>Reporter: Thomas FOURNIER
>Priority: Minor
>  Labels: patch
>
> Hi,
> I'm going through the FlinkML QuickStart guide: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/quickstart.html
> When using env.readCsvFile:
> val env = ExecutionEnvironment.getExecutionEnvironment
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> I encounter the following error:
> Error:(17, 69) could not find implicit value for evidence parameter of type 
> org.apache.flink.api.common.typeinfo.TypeInformation[(String, String, String, 
> String)]
> Error occurred in an application involving default arguments.
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> ^
> Is it related to this issue?
> https://issues.apache.org/jira/browse/FLINK-1255
> Thanks
> Thomas



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4720) Implement an archived version of the execution graph

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562421#comment-15562421
 ] 

ASF GitHub Bot commented on FLINK-4720:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2577
  
Thanks. Looks good to merge!


> Implement an archived version of the execution graph
> 
>
> Key: FLINK-4720
> URL: https://issues.apache.org/jira/browse/FLINK-4720
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Webfrontend
>Affects Versions: 1.1.2
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.2.0
>
>
> In order to implement a job history server, as well as separate the 
> JobManager from the WebInterface, we require an archived version of the 
> ExecutionGraph that is Serializable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

2016-10-10 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2577
  
Thanks. Looks good to merge!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (FLINK-3706) YARNSessionCapacitySchedulerITCase.testNonexistingQueue unstable

2016-10-10 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-3706:
-

Assignee: Robert Metzger

> YARNSessionCapacitySchedulerITCase.testNonexistingQueue unstable
> 
>
> Key: FLINK-3706
> URL: https://issues.apache.org/jira/browse/FLINK-3706
> Project: Flink
>  Issue Type: Bug
>Reporter: Aljoscha Krettek
>Assignee: Robert Metzger
>Priority: Critical
>  Labels: test-stability
> Attachments: log.txt
>
>
> I encountered a failed test on travis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3706) YARNSessionCapacitySchedulerITCase.testNonexistingQueue unstable

2016-10-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562415#comment-15562415
 ] 

Robert Metzger commented on FLINK-3706:
---

Another instance https://api.travis-ci.org/jobs/166416144/log.txt?deansi=true

I'll look into it.

> YARNSessionCapacitySchedulerITCase.testNonexistingQueue unstable
> 
>
> Key: FLINK-3706
> URL: https://issues.apache.org/jira/browse/FLINK-3706
> Project: Flink
>  Issue Type: Bug
>Reporter: Aljoscha Krettek
>Priority: Critical
>  Labels: test-stability
> Attachments: log.txt
>
>
> I encountered a failed test on travis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-4768) Migrate High Availability configuration options

2016-10-10 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-4768.
---

> Migrate High Availability configuration options
> ---
>
> Key: FLINK-4768
> URL: https://issues.apache.org/jira/browse/FLINK-4768
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4720) Implement an archived version of the execution graph

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562381#comment-15562381
 ] 

ASF GitHub Bot commented on FLINK-4720:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2577
  
I've added the Archiveable interface. I also added it to the 
ExecutionConfig, which required moving the ExecutionConfigSummary to 
flink-core. Since this move destroys the history anyway, I renamed it to 
ArchivedExecutionConfig for consistency purposes.


> Implement an archived version of the execution graph
> 
>
> Key: FLINK-4720
> URL: https://issues.apache.org/jira/browse/FLINK-4720
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Webfrontend
>Affects Versions: 1.1.2
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.2.0
>
>
> In order to implement a job history server, as well as separate the 
> JobManager from the WebInterface, we require an archived version of the 
> ExecutionGraph that is Serializable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-4768) Migrate High Availability configuration options

2016-10-10 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-4768.
-
   Resolution: Done
 Assignee: Stephan Ewen
Fix Version/s: 1.2.0

Done in abc1657bac83c151a1a345220942b02fcde4653a

> Migrate High Availability configuration options
> ---
>
> Key: FLINK-4768
> URL: https://issues.apache.org/jira/browse/FLINK-4768
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

2016-10-10 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2577
  
I've added the Archiveable interface. I also added it to the 
ExecutionConfig, which required moving the ExecutionConfigSummary to 
flink-core. Since this move destroys the history anyway, I renamed it to 
ArchivedExecutionConfig for consistency purposes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (FLINK-4764) Introduce config options

2016-10-10 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-4764.
---

> Introduce config options
> 
>
> Key: FLINK-4764
> URL: https://issues.apache.org/jira/browse/FLINK-4764
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.1.2
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.2.0
>
>
> It is a bit unorthodox to start a discussion via a pull request, but this 
> suggestion is best motivated via some code.
> I suggest moving away from the current model with `ConfigConstants` and moving 
> to a model where an `Option` object describes a configuration option 
> completely, with default value and fallback keys.
> h2. Advantages
>   - Much simpler / easier access to values that have deprecated keys
>   - Not possible to accidentally overlook deprecated keys
>   - Key and default values are grouped together in the definition
>   - Clearly states the expected value type for each config key (string, int, 
> etc.)
>   - We can improve this even further to include the description and 
> auto-generate the config docs
> h2. Example
> Simple option:
> {code}
> Option<String> TASK_MANAGER_TMP_DIRS = new Option<>(
>     "taskmanager.tmp.dirs",                // config key
>     System.getProperty("java.io.tmpdir")); // default value
> {code}
> Option with multiple deprecated keys:
> {code}
> Option<String> HA_CLUSTER_ID = new Option<>(
>     "high-availability.cluster-id",                // config key
>     null,                                          // no default value
>     "high-availability.zookeeper.path.namespace",  // latest deprecated key
>     "recovery.zookeeper.path.namespace");          // even earlier deprecated key
> {code}
> Get a config value; this automatically checks deprecated keys and default 
> values:
> {code}
> final String zkQuorum = configuration.getValue(ConfigOptions.HA_ZOOKEEPER_QUORUM);
> final long connTimeout = configuration.getInteger(ConfigOptions.HA_ZOOKEEPER_CONN_TIMEOUT);
> {code}
> h2. Multiple Options classes
> To avoid having one huge class (like `ConfigConstants`), we can easily split 
> this up into `TaskManagerOptions`, `JobManagerOptions`, `ZooKeeperOptions`, 
> etc.
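
For illustration, a minimal sketch of what such an option class could look like (names and shape are assumptions based on this description; the class merged via PR #2605 may differ in detail):

{code}
// Illustrative sketch only: a typed config option bundling key,
// default value, and an ordered list of deprecated fallback keys.
public class Option<T> {

    private final String key;
    private final T defaultValue;
    private final String[] deprecatedKeys;

    public Option(String key, T defaultValue, String... deprecatedKeys) {
        this.key = key;
        this.defaultValue = defaultValue;
        this.deprecatedKeys = deprecatedKeys;
    }

    public String key() { return key; }

    public T defaultValue() { return defaultValue; }

    /** Keys to check, newest first, when the current key is not set. */
    public String[] deprecatedKeys() { return deprecatedKeys; }
}
{code}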



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4768) Migrate High Availability configuration options

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562359#comment-15562359
 ] 

ASF GitHub Bot commented on FLINK-4768:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2607


> Migrate High Availability configuration options
> ---
>
> Key: FLINK-4768
> URL: https://issues.apache.org/jira/browse/FLINK-4768
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Stephan Ewen
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2605: [FLINK-4764] [core] Introduce Config Options

2016-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2605


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2607: [FLINK-4768] [core] Migrate high-availability conf...

2016-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2607


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4764) Introduce config options

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562358#comment-15562358
 ] 

ASF GitHub Bot commented on FLINK-4764:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2605


> Introduce config options
> 
>
> Key: FLINK-4764
> URL: https://issues.apache.org/jira/browse/FLINK-4764
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.1.2
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.2.0
>
>
> It is a bit unorthodox to start a discussion via a pull request, but this 
> suggestion is best motivated via some code.
> I suggest moving away from the current model with `ConfigConstants` and moving 
> to a model where an `Option` object describes a configuration option 
> completely, with default value and fallback keys.
> h2. Advantages
>   - Much simpler / easier access to values that have deprecated keys
>   - Not possible to accidentally overlook deprecated keys
>   - Key and default values are grouped together in the definition
>   - Clearly states the expected value type for each config key (string, int, 
> etc.)
>   - We can improve this even further to include the description and 
> auto-generate the config docs
> h2. Example
> Simple option:
> {code}
> Option<String> TASK_MANAGER_TMP_DIRS = new Option<>(
>     "taskmanager.tmp.dirs",                // config key
>     System.getProperty("java.io.tmpdir")); // default value
> {code}
> Option with multiple deprecated keys:
> {code}
> Option<String> HA_CLUSTER_ID = new Option<>(
>     "high-availability.cluster-id",                // config key
>     null,                                          // no default value
>     "high-availability.zookeeper.path.namespace",  // latest deprecated key
>     "recovery.zookeeper.path.namespace");          // even earlier deprecated key
> {code}
> Get a config value; this automatically checks deprecated keys and default 
> values:
> {code}
> final String zkQuorum = configuration.getValue(ConfigOptions.HA_ZOOKEEPER_QUORUM);
> final long connTimeout = configuration.getInteger(ConfigOptions.HA_ZOOKEEPER_CONN_TIMEOUT);
> {code}
> h2. Multiple Options classes
> To avoid having one huge class (like `ConfigConstants`), we can easily split 
> this up into `TaskManagerOptions`, `JobManagerOptions`, `ZooKeeperOptions`, 
> etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3987) Add data stream connector to DistributedLog

2016-10-10 Thread Jia Zhai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562329#comment-15562329
 ] 

Jia Zhai commented on FLINK-3987:
-

Hi John and Khurrum, 
Thanks for watching this issue; I have recently been busy with other things, 
sorry for the late response.
Please feel free to assign this issue and also FLINK-3988 to yourself, if you 
are interested.
Part of Yijie's code is available for reference: 
https://github.com/yjshen/flink/commit/c06c19c1040eb238345b92523de4286974956fe4

Thanks a lot.

> Add data stream connector to DistributedLog
> ---
>
> Key: FLINK-3987
> URL: https://issues.apache.org/jira/browse/FLINK-3987
> Project: Flink
>  Issue Type: Improvement
>Reporter: Jia Zhai
>
> I would like to add a connector to DistributedLog, which was recently published 
> by Twitter.
> All the information about DistributedLog can be found here: 
> http://distributedlog.io/html
> This JIRA ticket tracks the data stream connector.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4790) FlinkML - Error of type while using readCsvFile

2016-10-10 Thread Thomas FOURNIER (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562346#comment-15562346
 ] 

Thomas FOURNIER commented on FLINK-4790:


That solves the issue, thanks.

Maybe it's worth updating the documentation here?
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/quickstart.html

Use  ->   import org.apache.flink.api.scala._

Instead of ->   import org.apache.flink.api.scala.ExecutionEnvironment

> FlinkML - Error of type while using readCsvFile
> ---
>
> Key: FLINK-4790
> URL: https://issues.apache.org/jira/browse/FLINK-4790
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.2
>Reporter: Thomas FOURNIER
>Priority: Minor
>  Labels: patch
>
> Hi,
> I'm going through the FlinkML QuickStart guide: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/quickstart.html
> When using env.readCsvFile:
> val env = ExecutionEnvironment.getExecutionEnvironment
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> I encounter the following error:
> Error:(17, 69) could not find implicit value for evidence parameter of type 
> org.apache.flink.api.common.typeinfo.TypeInformation[(String, String, String, 
> String)]
> Error occurred in an application involving default arguments.
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> ^
> Is it related to this issue?
> https://issues.apache.org/jira/browse/FLINK-1255
> Thanks
> Thomas



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3988) Add data set connector to DistributedLog

2016-10-10 Thread Jia Zhai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562333#comment-15562333
 ] 

Jia Zhai commented on FLINK-3988:
-

Hi John and Khurrum, 
Thanks for watching this issue; I have recently been busy with other things, 
sorry for the late response.
Please feel free to assign this issue and also FLINK-3987 to yourself, if you 
are interested.
Part of Yijie's code is available for reference: 
https://github.com/yjshen/flink/commit/c06c19c1040eb238345b92523de4286974956fe4

Thanks a lot.

> Add data set connector to DistributedLog
> 
>
> Key: FLINK-3988
> URL: https://issues.apache.org/jira/browse/FLINK-3988
> Project: Flink
>  Issue Type: Improvement
>Reporter: Jia Zhai
>
> I would like to add a connector to DistributedLog, which was recently published 
> by Twitter.
> All the information about DistributedLog can be found here: 
> http://distributedlog.io/html
> This JIRA ticket tracks the data set connector.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4035) Bump Kafka producer in Kafka sink to Kafka 0.10.0.0

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562320#comment-15562320
 ] 

ASF GitHub Bot commented on FLINK-4035:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2369
  
Thank you for the review. I'll address your comments, rebase again, test 
it, and if it turns green, merge it ;)


> Bump Kafka producer in Kafka sink to Kafka 0.10.0.0
> ---
>
> Key: FLINK-4035
> URL: https://issues.apache.org/jira/browse/FLINK-4035
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>Assignee: Robert Metzger
>Priority: Minor
>
> Kafka 0.10.0.0 introduced protocol changes related to the producer.  
> Published messages now include timestamps and compressed messages now include 
> relative offsets.  As it is now, brokers must decompress publisher-compressed 
> messages, assign offsets to them, and recompress them, which is wasteful and 
> makes it less likely that compression will be used at all.
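
For context, a brief sketch of what the 0.10 producer API adds (plain Kafka client usage, not the Flink connector itself; broker address, topic, and serializers are placeholders):

{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

KafkaProducer<String, String> producer = new KafkaProducer<>(props);

// Since 0.10, a record can carry an explicit timestamp (epoch millis);
// the partition may be null to use the default partitioner.
producer.send(new ProducerRecord<>("testTopic", null, System.currentTimeMillis(), "key", "value"));
producer.close();
{code}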



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2369: [FLINK-4035] Add a streaming connector for Apache Kafka 0...

2016-10-10 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2369
  
Thank you for the review. I'll address your comments, rebase again, test 
it, and if it turns green, merge it ;)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4790) FlinkML - Error of type while using readCsvFile

2016-10-10 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562316#comment-15562316
 ] 

Timo Walther commented on FLINK-4790:
-

When using the Scala API, make sure to import the package: 
{{org.apache.flink.api.scala._}}

See also: 
http://flink.apache.org/faq.html#in-scala-api-i-get-an-error-about-implicit-values-and-evidence-parameters
 

If this solves the issue, please close this issue.

> FlinkML - Error of type while using readCsvFile
> ---
>
> Key: FLINK-4790
> URL: https://issues.apache.org/jira/browse/FLINK-4790
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.2
>Reporter: Thomas FOURNIER
>Priority: Minor
>  Labels: patch
>
> Hi,
> I'm going through the FlinkML QuickStart guide: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/quickstart.html
> When using env.readCsvFile:
> val env = ExecutionEnvironment.getExecutionEnvironment
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> I encounter the following error:
> Error:(17, 69) could not find implicit value for evidence parameter of type 
> org.apache.flink.api.common.typeinfo.TypeInformation[(String, String, String, 
> String)]
> Error occurred in an application involving default arguments.
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> ^
> Is it related to this issue?
> https://issues.apache.org/jira/browse/FLINK-1255
> Thanks
> Thomas



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-4497) Add support for Scala tuples and case classes to Cassandra sink

2016-10-10 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-4497:
---

Assignee: Chesnay Schepler

> Add support for Scala tuples and case classes to Cassandra sink
> ---
>
> Key: FLINK-4497
> URL: https://issues.apache.org/jira/browse/FLINK-4497
> Project: Flink
>  Issue Type: Improvement
>  Components: Cassandra Connector
>Affects Versions: 1.1.0
>Reporter: Elias Levy
>Assignee: Chesnay Schepler
>
> The new Cassandra sink only supports streams of Flink Java tuples and Java 
> POJOs that have been annotated for use by Datastax Mapper.  The sink should 
> be extended to support Scala types and case classes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-4783) Allow to register TypeInfoFactories manually

2016-10-10 Thread Alexander Alexandrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Alexandrov reassigned FLINK-4783:
---

Assignee: Alexander Alexandrov

> Allow to register TypeInfoFactories manually
> 
>
> Key: FLINK-4783
> URL: https://issues.apache.org/jira/browse/FLINK-4783
> Project: Flink
>  Issue Type: Improvement
>  Components: Type Serialization System
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Assignee: Alexander Alexandrov
>Priority: Minor
> Fix For: 1.2.0
>
>
> The newly introduced {{TypeInfoFactories}} (FLINK-3042 and FLINK-3060) allow 
> creating {{TypeInformations}} for types which are annotated with 
> {{TypeInfo}}. This is useful if the user has control over the type for which 
> he wants to generate the {{TypeInformation}}.
> However, annotating a type is not always possible if the type comes from an 
> external library. In this case, it would be good to be able to directly 
> register a {{TypeInfoFactory}} without having to annotate the type.
> {{TypeExtractor#registerFactory}} is already such a method. However, it 
> is declared private.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4790) FlinkML - Error of type while using readCsvFile

2016-10-10 Thread Thomas FOURNIER (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas FOURNIER updated FLINK-4790:
---
Priority: Minor  (was: Major)

> FlinkML - Error of type while using readCsvFile
> ---
>
> Key: FLINK-4790
> URL: https://issues.apache.org/jira/browse/FLINK-4790
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.2
>Reporter: Thomas FOURNIER
>Priority: Minor
>  Labels: patch
>
> Hi,
> I'm going through the FlinkML QuickStart guide: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/quickstart.html
> When using env.readCsvFile:
> val env = ExecutionEnvironment.getExecutionEnvironment
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> I encounter the following error:
> Error:(17, 69) could not find implicit value for evidence parameter of type 
> org.apache.flink.api.common.typeinfo.TypeInformation[(String, String, String, 
> String)]
> Error occurred in an application involving default arguments.
> val survival = env.readCsvFile[(String, String, String, String)]("path/data")
> ^
> Is it related to this issue?
> https://issues.apache.org/jira/browse/FLINK-1255
> Thanks
> Thomas



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4790) FlinkML - Error of type while using readCsvFile

2016-10-10 Thread Thomas FOURNIER (JIRA)
Thomas FOURNIER created FLINK-4790:
--

 Summary: FlinkML - Error of type while using readCsvFile
 Key: FLINK-4790
 URL: https://issues.apache.org/jira/browse/FLINK-4790
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.1.2
Reporter: Thomas FOURNIER


Hi,

I'm going through the FlinkML QuickStart guide: 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/ml/quickstart.html

When using env.readCsvFile:

val env = ExecutionEnvironment.getExecutionEnvironment
val survival = env.readCsvFile[(String, String, String, String)]("path/data")

I encounter the following error:

Error:(17, 69) could not find implicit value for evidence parameter of type 
org.apache.flink.api.common.typeinfo.TypeInformation[(String, String, String, 
String)]
Error occurred in an application involving default arguments.
val survival = env.readCsvFile[(String, String, String, String)]("path/data")
^
Is it related to this issue?
https://issues.apache.org/jira/browse/FLINK-1255


Thanks

Thomas




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-3263) Log task statistics on TaskManager

2016-10-10 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-3263.
-
Resolution: Not A Problem

The metrics subsystem can provide this functionality.

> Log task statistics on TaskManager
> --
>
> Key: FLINK-3263
> URL: https://issues.apache.org/jira/browse/FLINK-3263
> Project: Flink
>  Issue Type: Improvement
>  Components: TaskManager
>Affects Versions: 1.0.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
>
> Similar to how memory statistics can be written to the TaskMangers' log files 
> by configuring {{taskmanager.debug.memory.startLogThread}} and 
> {{taskmanager.debug.memory.logIntervalMs}}, it would be useful to have 
> statistics written for each task within a job.
> One use case is to reconstruct progress to analyze why TaskManagers take 
> different amounts of time to process the same quantity of data.
> I envision this being the same statistics which are displayed on the web 
> frontend.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4720) Implement an archived version of the execution graph

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562293#comment-15562293
 ] 

ASF GitHub Bot commented on FLINK-4720:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2577
  
Ah, I see. I removed it since it was pretty much identical to the 
ExecutionConfigSummary class; it would've essentially been a rename. I like the 
interface idea; it makes things a bit more organized.


> Implement an archived version of the execution graph
> 
>
> Key: FLINK-4720
> URL: https://issues.apache.org/jira/browse/FLINK-4720
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Webfrontend
>Affects Versions: 1.1.2
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.2.0
>
>
> In order to implement a job history server, as well as separate the 
> JobManager from the WebInterface, we require an archived version of the 
> ExecutionGraph that is Serializable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

2016-10-10 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2577
  
Ah, I see. I removed it since it was pretty much identical to the 
ExecutionConfigSummary class; it would've essentially been a rename. I like the 
interface idea; it makes things a bit more organized.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (FLINK-4783) Allow to register TypeInfoFactories manually

2016-10-10 Thread Alexander Alexandrov (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562288#comment-15562288
 ] 

Alexander Alexandrov edited comment on FLINK-4783 at 10/10/16 1:18 PM:
---

Alright, I guess keeping {{TypeExtractor#registerFactory}} static makes sense, 
but it still does not solve the problem with types from an external library.

Maybe delegating through {{ExecutionConfig}} as suggested by [~till.rohrmann] 
is a cleaner way that will provide means to throw an exception if the 
registration happens at a wrong position in the code.
This will also be on par with the similar method which already exists for 
registering Kryo serializers.

{code:java}
env.getConfig().registerTypeInfoFactory(Class<?> type, Class<? extends TypeInfoFactory> factory)
{code}


was (Author: aalexandrov):
Alright, I guess keeping {{TypeExtractor#registerFactory}} static makes sense, 
but it still does not solve the problem with types from an external library.

Maybe delegating through {{ExecutionConfig}} as suggested by [~till.rohrmann] 
is a cleaner way that will provide means to throw an exception if the 
registration happens at a wrong position in the code.
This will also be on par with the similar method which already exists for 
registering Kryo serializers.

{code:scala}
env.getConfig().registerTypeInfoFactory(Class<?> type, Class<? extends TypeInfoFactory> factory)
{code:scala}

> Allow to register TypeInfoFactories manually
> 
>
> Key: FLINK-4783
> URL: https://issues.apache.org/jira/browse/FLINK-4783
> Project: Flink
>  Issue Type: Improvement
>  Components: Type Serialization System
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Priority: Minor
> Fix For: 1.2.0
>
>
> The newly introduced {{TypeInfoFactories}} (FLINK-3042 and FLINK-3060) allow 
> creating {{TypeInformations}} for types which are annotated with 
> {{TypeInfo}}. This is useful if the user has control over the type for which 
> he wants to generate the {{TypeInformation}}.
> However, annotating a type is not always possible if the type comes from an 
> external library. In this case, it would be good to be able to directly 
> register a {{TypeInfoFactory}} without having to annotate the type.
> {{TypeExtractor#registerFactory}} is already such a method. However, it 
> is declared private.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4788) State backend class cannot be loaded, because fully qualified name converted to lower-case

2016-10-10 Thread Florian Koenig (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562289#comment-15562289
 ] 

Florian Koenig commented on FLINK-4788:
---

Thanks for the quick fix!

Sorry about the links to the wrong repository in the original description. For 
the record, here are the correct links to the mentioned locations in the code 
(at the time of the submission):

https://github.com/apache/flink/blob/0dac7ad00a25c4b038b57009f41f5512f8ce73c3/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L755

https://github.com/apache/flink/blob/0dac7ad00a25c4b038b57009f41f5512f8ce73c3/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L773

> State backend class cannot be loaded, because fully qualified name converted 
> to lower-case
> --
>
> Key: FLINK-4788
> URL: https://issues.apache.org/jira/browse/FLINK-4788
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.2
> Environment: Debian Linux and Mac OS X
>Reporter: Florian Koenig
>Assignee: Stephan Ewen
>Priority: Blocker
> Fix For: 1.2.0, 1.1.3
>
>
> The name of the state backend in the configuration file is converted to 
> lower-case (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L587).
> Therefore, the class cannot be found, and the state backend (e.g., RocksDB) 
> is not loaded properly (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L603).
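
In simplified form, the failure mode looks like this (a sketch of the pattern, not the exact StreamTask code; the shortcut names follow the Flink docs):

{code}
import org.apache.flink.configuration.Configuration;

public class StateBackendLoadingSketch {

    static void createStateBackend(Configuration config) throws Exception {
        // The configured value is lower-cased once to match the built-in shortcuts...
        String backendName = config.getString("state.backend", "jobmanager").toLowerCase();

        if (backendName.equals("jobmanager")) {
            // create the memory-backed state backend
        } else if (backendName.equals("filesystem")) {
            // create the filesystem state backend
        } else {
            // ...but the same lower-cased string is then used as a fully
            // qualified class name, so a name like
            // "org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory"
            // is looked up in all lower-case and Class.forName() fails.
            Class.forName(backendName);
        }
    }
}
{code}

The fix is to lower-case only the copy used for matching the shortcuts and to load the class from the original, case-preserving string.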



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4783) Allow to register TypeInfoFactories manually

2016-10-10 Thread Alexander Alexandrov (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562288#comment-15562288
 ] 

Alexander Alexandrov commented on FLINK-4783:
-

Alright, I guess keeping {{TypeExtractor#registerFactory}} static makes sense, 
but it still does not solve the problem with types from an external library.

Maybe delegating through {{ExecutionConfig}} as suggested by [~till.rohrmann] 
is a cleaner way that will provide means to throw an exception if the 
registration happens at a wrong position in the code.
This will also be on par with the similar method which already exists for 
registering Kryo serializers.

{code:scala}
env.getConfig().registerTypeInfoFactory(Class<?> type, Class<? extends TypeInfoFactory> factory)
{code:scala}
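
To make the proposal concrete, a hypothetical end-to-end usage sketch (the external type, the factory, and the {{registerTypeInfoFactory}} method on {{ExecutionConfig}} are assumptions from this discussion, not existing API):

{code:java}
import java.lang.reflect.Type;
import java.util.Map;

import org.apache.flink.api.common.typeinfo.TypeInfoFactory;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.GenericTypeInfo;

// A type from an external library that we cannot annotate with @TypeInfo.
class ThirdPartyType {}

// Factory producing the TypeInformation for that type.
class ThirdPartyTypeInfoFactory extends TypeInfoFactory<ThirdPartyType> {
    @Override
    public TypeInformation<ThirdPartyType> createTypeInfo(
            Type t, Map<String, TypeInformation<?>> genericParameters) {
        // Fall back to generic (Kryo-based) serialization for this sketch.
        return new GenericTypeInfo<>(ThirdPartyType.class);
    }
}

// Proposed registration, analogous to registering a Kryo serializer:
// env.getConfig().registerTypeInfoFactory(ThirdPartyType.class, ThirdPartyTypeInfoFactory.class);
{code}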

> Allow to register TypeInfoFactories manually
> 
>
> Key: FLINK-4783
> URL: https://issues.apache.org/jira/browse/FLINK-4783
> Project: Flink
>  Issue Type: Improvement
>  Components: Type Serialization System
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Priority: Minor
> Fix For: 1.2.0
>
>
> The newly introduced {{TypeInfoFactories}} (FLINK-3042 and FLINK-3060) allow 
> creating {{TypeInformations}} for types which are annotated with 
> {{TypeInfo}}. This is useful if the user has control over the type for which 
> he wants to generate the {{TypeInformation}}.
> However, annotating a type is not always possible if the type comes from an 
> external library. In this case, it would be good to be able to directly 
> register a {{TypeInfoFactory}} without having to annotate the type.
> {{TypeExtractor#registerFactory}} is already such a method. However, it 
> is declared private.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4789) Avoid Kafka partition discovery on restore and share consumer instance for discovery and data consumption

2016-10-10 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-4789:
-

 Summary: Avoid Kafka partition discovery on restore and share 
consumer instance for discovery and data consumption
 Key: FLINK-4789
 URL: https://issues.apache.org/jira/browse/FLINK-4789
 Project: Flink
  Issue Type: Improvement
  Components: Kafka Connector
Affects Versions: 1.2.0
Reporter: Robert Metzger


As part of FLINK-4379, the Kafka partition discovery was moved from the 
constructor to the open() method. This is in general a good change, as outlined 
in FLINK-4155, since it allows us to detect new partitions and topics based on 
a regex on the fly.

However, currently the partitions are discovered on restore as well. 
Also, {{FlinkKafkaConsumer09.getKafkaPartitions()}} creates a separate 
{{KafkaConsumer}} just for the partition discovery.
Since the partition discovery happens on the task managers now, we can use the 
regular {{KafkaConsumer}} instance, which is used for data retrieval as well.
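
A rough sketch of the direction described above (hypothetical class and method names; the actual consumer code is more involved):

{code}
import java.util.List;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.PartitionInfo;

// Inside the consumer task: reuse the consumer instance that will also be
// used for polling records, instead of creating a throwaway instance.
public class PartitionDiscoverySketch {

    private final KafkaConsumer<byte[], byte[]> consumer;

    public PartitionDiscoverySketch(KafkaConsumer<byte[], byte[]> consumer) {
        this.consumer = consumer;
    }

    public List<PartitionInfo> discoverPartitions(String topic) {
        // partitionsFor() queries the broker for the topic's partition metadata.
        return consumer.partitionsFor(topic);
    }
}
{code}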



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4720) Implement an archived version of the execution graph

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562255#comment-15562255
 ] 

ASF GitHub Bot commented on FLINK-4720:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2577
  
>I am using the ExecutionConfigSummary (and removed my original 
ArchivedExecutionConfig) when i rebased, so I'm not sure what you mean :/

That's what I meant, I wondered why you didn't use your archive design 
pattern for `ExecutionConfig`.

What do you think about the `Archiveable` interface 
I proposed? 
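
For reference, such an interface could be as small as the following (a sketch of the idea under discussion, not necessarily the exact signature that was merged):

{code}
import java.io.Serializable;

// Implemented by runtime classes that can produce a serializable,
// immutable snapshot of themselves for archiving purposes.
public interface Archiveable<T extends Serializable> {
    T archive();
}
{code}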




> Implement an archived version of the execution graph
> 
>
> Key: FLINK-4720
> URL: https://issues.apache.org/jira/browse/FLINK-4720
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Webfrontend
>Affects Versions: 1.1.2
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.2.0
>
>
> In order to implement a job history server, as well as separate the 
> JobManager from the WebInterface, we require an archived version of the 
> ExecutionGraph that is Serializable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

2016-10-10 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2577
  
>I am using the ExecutionConfigSummary (and removed my original 
ArchivedExecutionConfig) when i rebased, so I'm not sure what you mean :/

That's what I meant, I wondered why you didn't use your archive design 
pattern for `ExecutionConfig`.

What do you think about the `Archiveable` interface 
I proposed? 




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4673) TypeInfoFactory for Either type

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562254#comment-15562254
 ] 

ASF GitHub Bot commented on FLINK-4673:
---

Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/2545#discussion_r82598976
  
--- Diff: flink-core/src/main/java/org/apache/flink/api/java/typeutils/TypeExtractor.java ---
@@ -675,38 +673,6 @@ else if (isClassType(t) && Tuple.class.isAssignableFrom(typeToClass(t))) {
 			return new TupleTypeInfo(typeToClass(t), subTypesInfo);
 
 		}
-		// check if type is a subclass of Either
-		else if (isClassType(t) && Either.class.isAssignableFrom(typeToClass(t))) {
-			Type curT = t;
-
-			// go up the hierarchy until we reach Either (with or without generics)
-			// collect the types while moving up for a later top-down
-			while (!(isClassType(curT) && typeToClass(curT).equals(Either.class))) {
-				typeHierarchy.add(curT);
-				curT = typeToClass(curT).getGenericSuperclass();
-			}
-
-			// check if Either has generics
-			if (curT instanceof Class) {
-				throw new InvalidTypesException("Either needs to be parameterized by using generics.");
-			}
-
-			typeHierarchy.add(curT);
-
-			// create the type information for the subtypes
-			final TypeInformation<?>[] subTypesInfo = createSubTypesInfo(t, (ParameterizedType) curT, typeHierarchy, in1Type, in2Type, false);
-			// type needs to be treated a pojo due to additional fields
-			if (subTypesInfo == null) {
-				if (t instanceof ParameterizedType) {
-					return (TypeInformation<OUT>) analyzePojo(typeToClass(t), new ArrayList<Type>(typeHierarchy), (ParameterizedType) t, in1Type, in2Type);
--- End diff --

Sorry for the delay. I will have a look at it again.


> TypeInfoFactory for Either type
> ---
>
> Key: FLINK-4673
> URL: https://issues.apache.org/jira/browse/FLINK-4673
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
>
> I was able to resolve the requirement to specify an explicit 
> {{TypeInformation}} in the pull request for FLINK-4624 by creating a 
> {{TypeInfoFactory}} for the {{Either}} type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2545: [FLINK-4673] [core] TypeInfoFactory for Either typ...

2016-10-10 Thread twalthr
Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/2545#discussion_r82598976
  
--- Diff: flink-core/src/main/java/org/apache/flink/api/java/typeutils/TypeExtractor.java ---
@@ -675,38 +673,6 @@ else if (isClassType(t) && Tuple.class.isAssignableFrom(typeToClass(t))) {
 			return new TupleTypeInfo(typeToClass(t), subTypesInfo);
 
 		}
-		// check if type is a subclass of Either
-		else if (isClassType(t) && Either.class.isAssignableFrom(typeToClass(t))) {
-			Type curT = t;
-
-			// go up the hierarchy until we reach Either (with or without generics)
-			// collect the types while moving up for a later top-down
-			while (!(isClassType(curT) && typeToClass(curT).equals(Either.class))) {
-				typeHierarchy.add(curT);
-				curT = typeToClass(curT).getGenericSuperclass();
-			}
-
-			// check if Either has generics
-			if (curT instanceof Class) {
-				throw new InvalidTypesException("Either needs to be parameterized by using generics.");
-			}
-
-			typeHierarchy.add(curT);
-
-			// create the type information for the subtypes
-			final TypeInformation<?>[] subTypesInfo = createSubTypesInfo(t, (ParameterizedType) curT, typeHierarchy, in1Type, in2Type, false);
-			// type needs to be treated a pojo due to additional fields
-			if (subTypesInfo == null) {
-				if (t instanceof ParameterizedType) {
-					return (TypeInformation<OUT>) analyzePojo(typeToClass(t), new ArrayList<Type>(typeHierarchy), (ParameterizedType) t, in1Type, in2Type);
--- End diff --

Sorry for the delay. I will have a look at it again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

2016-10-10 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2577
  
I did not want to rename all usages of ExecutionGraph, correct. Otherwise I 
would've gone with exactly the names you suggested.

I am using the ExecutionConfigSummary (and removed my original 
ArchivedExecutionConfig) when I rebased, so I'm not sure what you mean :/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4720) Implement an archived version of the execution graph

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562237#comment-15562237
 ] 

ASF GitHub Bot commented on FLINK-4720:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2577
  
I did not want to rename all usages of ExecutionGraph, correct. Otherwise I 
would've gone with exactly the names you suggested.

I am using the ExecutionConfigSummary (and removed my original 
ArchivedExecutionConfig) when I rebased, so I'm not sure what you mean :/


> Implement an archived version of the execution graph
> 
>
> Key: FLINK-4720
> URL: https://issues.apache.org/jira/browse/FLINK-4720
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Webfrontend
>Affects Versions: 1.1.2
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.2.0
>
>
> In order to implement a job history server, as well as separate the 
> JobManager from the WebInterface, we require an archived version of the 
> ExecutionGraph that is Serializable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4788) State backend class cannot be loaded, because fully qualified name converted to lower-case

2016-10-10 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562226#comment-15562226
 ] 

Stephan Ewen commented on FLINK-4788:
-

1.1.3 fix in d619f51ac8f922c0cf1d1e789c5141076128f04e

> State backend class cannot be loaded, because fully qualified name converted 
> to lower-case
> --
>
> Key: FLINK-4788
> URL: https://issues.apache.org/jira/browse/FLINK-4788
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.2
> Environment: Debian Linux and Mac OS X
>Reporter: Florian Koenig
>Assignee: Stephan Ewen
>Priority: Blocker
> Fix For: 1.2.0, 1.1.3
>
>
> The name of the state backend in the configuration file is converted to 
> lower-case (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L587).
> Therefore, the class cannot be found, and the state backend (e.g., RocksDB) 
> is not loaded properly (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L603).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4023) Move Kafka consumer partition discovery from constructor to open()

2016-10-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562211#comment-15562211
 ] 

Robert Metzger edited comment on FLINK-4023 at 10/10/16 12:48 PM:
--

This has been resolved in FLINK-4379. 
I'm sorry that the communication didn't happen properly here. (@[~srichter], 
[~tzulitai] was assigned to that task, but I think it's okay for Gordon that 
this is resolved. In the future, it would be nice if you could check whether 
there's already a JIRA for a change.)


was (Author: rmetzger):
This has been resolved in FLINK-4379.

> Move Kafka consumer partition discovery from constructor to open()
> --
>
> Key: FLINK-4023
> URL: https://issues.apache.org/jira/browse/FLINK-4023
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Michal Harish
>Assignee: Stefan Richter
>Priority: Minor
>  Labels: kafka, kafka-0.8
> Fix For: 1.2.0
>
>
> Currently, Flink queries Kafka for partition information when creating the 
> Kafka consumer. This is done on the client when submitting the Flink job, 
> which requires the client to be able to fetch the partition data from Kafka 
> which may only be accessible from the cluster environment where the tasks 
> will be running. Moving the partition discovery to the open() method should 
> solve this problem.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4023) Move Kafka consumer partition discovery from constructor to open()

2016-10-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562211#comment-15562211
 ] 

Robert Metzger commented on FLINK-4023:
---

This has been resolved in FLINK-4379.

> Move Kafka consumer partition discovery from constructor to open()
> --
>
> Key: FLINK-4023
> URL: https://issues.apache.org/jira/browse/FLINK-4023
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Michal Harish
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Minor
>  Labels: kafka, kafka-0.8
> Fix For: 1.2.0
>
>
> Currently, Flink queries Kafka for partition information when creating the 
> Kafka consumer. This is done on the client when submitting the Flink job, 
> which requires the client to be able to fetch the partition data from Kafka, 
> which may only be accessible from the cluster environment where the tasks 
> will be running. Moving the partition discovery to the open() method should 
> solve this problem.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-4023) Move Kafka consumer partition discovery from constructor to open()

2016-10-10 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-4023.
---
Resolution: Fixed
  Assignee: Stefan Richter  (was: Tzu-Li (Gordon) Tai)

> Move Kafka consumer partition discovery from constructor to open()
> --
>
> Key: FLINK-4023
> URL: https://issues.apache.org/jira/browse/FLINK-4023
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Michal Harish
>Assignee: Stefan Richter
>Priority: Minor
>  Labels: kafka, kafka-0.8
> Fix For: 1.2.0
>
>
> Currently, Flink queries Kafka for partition information when creating the 
> Kafka consumer. This is done on the client when submitting the Flink job, 
> which requires the client to be able to fetch the partition data from Kafka, 
> which may only be accessible from the cluster environment where the tasks 
> will be running. Moving the partition discovery to the open() method should 
> solve this problem.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4155) Get Kafka producer partition info in open method instead of constructor

2016-10-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562207#comment-15562207
 ] 

Robert Metzger commented on FLINK-4155:
---

This issue has been resolved with FLINK-4379.

> Get Kafka producer partition info in open method instead of constructor
> ---
>
> Key: FLINK-4155
> URL: https://issues.apache.org/jira/browse/FLINK-4155
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.0, 1.0.3
>Reporter: Gyula Fora
>Assignee: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>
> Currently the Flink Kafka producer does not really do any error handling if 
> something is wrong with the partition metadata as it is serialized with the 
> user function.
> This means that in some cases the job can go into an error loop when using 
> checkpoints. Getting the partition info in the open method would solve this 
> problem (similar to restarting from a savepoint, which re-runs the 
> constructor).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (FLINK-4155) Get Kafka producer partition info in open method instead of constructor

2016-10-10 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-4155:
--
Comment: was deleted

(was: This issue has been resolved with FLINK-4379.)

> Get Kafka producer partition info in open method instead of constructor
> ---
>
> Key: FLINK-4155
> URL: https://issues.apache.org/jira/browse/FLINK-4155
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.0, 1.0.3
>Reporter: Gyula Fora
>Assignee: Tzu-Li (Gordon) Tai
> Fix For: 1.2.0
>
>
> Currently the Flink Kafka producer does not really do any error handling if 
> something is wrong with the partition metadata as it is serialized with the 
> user function.
> This means that in some cases the job can go into an error loop when using 
> checkpoints. Getting the partition info in the open method would solve this 
> problem (similar to restarting from a savepoint, which re-runs the 
> constructor).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2611: [FLINK-4778] Update program example in /docs/setup/cli.md...

2016-10-10 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/2611
  
Merging


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3656) Rework Table API tests

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562200#comment-15562200
 ] 

ASF GitHub Bot commented on FLINK-3656:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/2595
  
Merging


> Rework Table API tests
> --
>
> Key: FLINK-3656
> URL: https://issues.apache.org/jira/browse/FLINK-3656
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Vasia Kalavri
>  Labels: starter
>
> The Table API tests are very inefficient. At the moment they are mostly 
> end-to-end integration tests, often testing the same functionality several 
> times (Java/Scala, DataSet/DataStream).
> We should look into how we can rework the Table API tests such that:
> - long-running integration tests are converted into faster unit tests
> - common parts of DataSet and DataStream are only tested once
> - common parts of Java and Scala Table APIs are only tested once
> - duplicate tests are completely removed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4311) TableInputFormat fails when reused on next split

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562203#comment-15562203
 ] 

ASF GitHub Bot commented on FLINK-4311:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/2330
  
Merging


> TableInputFormat fails when reused on next split
> 
>
> Key: FLINK-4311
> URL: https://issues.apache.org/jira/browse/FLINK-4311
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.3
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>Priority: Critical
> Fix For: 1.1.3
>
>
> We have written a batch job that uses data from HBase by means of the 
> TableInputFormat.
> We have found that this class sometimes fails with this exception:
> {quote}
> java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: 
> Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@4f4efe4b
>  rejected from java.util.concurrent.ThreadPoolExecutor@7872d5c1[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1165]
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:208)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
>   at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:821)
>   at 
> org.apache.flink.addons.hbase.TableInputFormat.open(TableInputFormat.java:152)
>   at 
> org.apache.flink.addons.hbase.TableInputFormat.open(TableInputFormat.java:47)
>   at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:147)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@4f4efe4b
>  rejected from java.util.concurrent.ThreadPoolExecutor@7872d5c1[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1165]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:142)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:269)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:165)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>   ... 10 more
> {quote}
> As you can see the ThreadPoolExecutor was terminated at this point.
> We tracked it down to the fact that 
> # the configure method opens the table
> # the open method obtains the result scanner
> # the close method closes the table.
> If a second split arrives on the same instance then the open method will fail 
> because the table has already been closed.
> We also found that this error varies with the versions of HBase that are 
> used. I have also seen this exception:
> {quote}
> Caused by: java.io.IOException: hconnection-0x19d37183 closed
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1146)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
>   ... 37 more
> {quote}
> I found that in the [documentation of the InputFormat 
> interface|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/io/InputFormat.html]
>  it is clearly stated
> {quote}IMPORTANT NOTE: Input formats must be written such that an instance 
> can be opened again after it was closed. That is due to the fact that the 
> input format is used for potentially multiple splits. After a split is done, 
> the format's close function is invoked and, if another split is available, 
> the open function is invoked afterwards for the next split.{quote}
> It appears that this specific InputFormat has not been written with this 
> requirement in mind.
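
A minimal sketch of the reopenable lifecycle the contract asks for (the class and table name are illustrative, not the actual TableInputFormat patch): every per-split resource is created in open() and released in close(), so the sequence open -> close -> open works on a single instance.

{code}
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

class ReopenableHBaseInput {

    private HTable table;
    private ResultScanner scanner;

    void configure() {
        // read settings only; do not open connections here
    }

    void open(Scan scan) throws IOException {
        // (re)create the connection for each split
        table = new HTable(HBaseConfiguration.create(), "my-table");
        scanner = table.getScanner(scan);
    }

    void close() throws IOException {
        if (scanner != null) {
            scanner.close();
            scanner = null;
        }
        if (table != null) {
            table.close();
            table = null;  // open() can now be called again for the next split
        }
    }
}
{code}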

[GitHub] flink issue #2330: FLINK-4311 Fixed several problems in TableInputFormat

2016-10-10 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/2330
  
Merging


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4778) Update program example in /docs/setup/cli.md due to the change in FLINK-2021

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562201#comment-15562201
 ] 

ASF GitHub Bot commented on FLINK-4778:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/2611
  
Merging


> Update program example in /docs/setup/cli.md due to the change in FLINK-2021
> 
>
> Key: FLINK-4778
> URL: https://issues.apache.org/jira/browse/FLINK-4778
> Project: Flink
>  Issue Type: Task
>Reporter: Pattarawat Chormai
>Priority: Trivial
>  Labels: documentation, starter
>
> According to FLINK-2021, ParameterTool was introduced, so all input 
> parameters need to be passed with a corresponding prefix such as
> {noformat} --input, --output {noformat}
> However, the WordCount call in 
> https://github.com/apache/flink/blob/master/docs/setup/cli.md#examples hasn't 
> been updated yet.
> REF: 
> https://github.com/apache/flink/commit/0629e25602eefdc239e8e72d9e3c9c1a5164448e
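
A minimal sketch of the argument handling the updated example relies on (paths are supplied by the user at submission time):

{code}
// org.apache.flink.api.java.utils.ParameterTool
// parses arguments of the form: --input <path> --output <path>
ParameterTool params = ParameterTool.fromArgs(args);
String input = params.getRequired("input");
String output = params.getRequired("output");
{code}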



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2595: [FLINK-3656] [table] Test base for logical unit testing

2016-10-10 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/2595
  
Merging


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4743) The sqrt/power function not accept the real data types.

2016-10-10 Thread Anton Solovev (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562191#comment-15562191
 ] 

Anton Solovev commented on FLINK-4743:
--

Should the new function provide overloads for other types, like sqrt(BigDecimal) or 
power(BigDecimal, double)?
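
A minimal sketch of such overloads (signatures are illustrative, not the actual Table API code generation):

{code}
import java.math.BigDecimal;

public final class NumericFunctions {

    public static double sqrt(BigDecimal value) {
        return Math.sqrt(value.doubleValue());
    }

    public static double power(BigDecimal base, double exponent) {
        return Math.pow(base.doubleValue(), exponent);
    }
}
{code}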

> The sqrt/power function not accept the real data types.
> ---
>
> Key: FLINK-4743
> URL: https://issues.apache.org/jira/browse/FLINK-4743
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.1.1
>Reporter: Anton Mushin
>Assignee: Anton Solovev
>
> When evaluating a sequence of SQL aggregate functions on a column of a real 
> (floating-point) type, an exception is thrown: No applicable 
> constructor/method found for actual parameters "float, java.math.BigDecimal".
> For columns of integer type the problem does not occur.
> Code reproduce:
> {code}
> @Test
> def test(): Unit = {
>   val env = ExecutionEnvironment.getExecutionEnvironment
>   val tEnv = TableEnvironment.getTableEnvironment(env, config)
>   val ds = env.fromElements(
>     (1.0f, 1),
>     (2.0f, 2)).toTable(tEnv)
>   tEnv.registerTable("MyTable", ds)
>   val sqlQuery = "SELECT " +
>     "SQRT((SUM(a*a) - SUM(a)*SUM(a)/COUNT(a))/COUNT(a)) " +
>     "from (select _1 as a from MyTable)"
>   tEnv.sql(sqlQuery).toDataSet[Row].collect().foreach(x => print(s"$x "))
> }
> {code}
> got exception:
> {noformat}
> org.apache.flink.api.common.InvalidProgramException: Table program cannot be 
> compiled. This is a bug. Please file an issue.
>   at 
> org.apache.flink.api.table.runtime.FunctionCompiler$class.compile(FunctionCompiler.scala:37)
>   at 
> org.apache.flink.api.table.runtime.FlatMapRunner.compile(FlatMapRunner.scala:28)
>   at 
> org.apache.flink.api.table.runtime.FlatMapRunner.open(FlatMapRunner.scala:42)
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:38)
>   at 
> org.apache.flink.api.common.operators.base.FlatMapOperatorBase.executeOnCollections(FlatMapOperatorBase.java:62)
>   at 
> org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:249)
>   at 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:147)
>   at 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:129)
>   at 
> org.apache.flink.api.common.operators.CollectionExecutor.executeDataSink(CollectionExecutor.java:180)
>   at 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:156)
>   at 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:129)
>   at 
> org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:113)
>   at 
> org.apache.flink.api.java.CollectionEnvironment.execute(CollectionEnvironment.java:35)
>   at 
> org.apache.flink.test.util.CollectionTestEnvironment.execute(CollectionTestEnvironment.java:47)
>   at 
> org.apache.flink.test.util.CollectionTestEnvironment.execute(CollectionTestEnvironment.java:42)
>   at 
> org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:637)
>   at org.apache.flink.api.scala.DataSet.collect(DataSet.scala:547)
>   at 
> org.apache.flink.api.scala.batch.sql.AggregationsITCase.test(AggregationsITCase.scala:307)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> ...

[jira] [Commented] (FLINK-4771) Compression for AvroOutputFormat

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562188#comment-15562188
 ] 

ASF GitHub Bot commented on FLINK-4771:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2612#discussion_r82594969
  
--- Diff: 
flink-batch-connectors/flink-avro/src/test/java/org/apache/flink/api/java/io/AvroOutputFormatTest.java
 ---
@@ -0,0 +1,106 @@
+package org.apache.flink.api.java.io;
+
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import java.io.IOException;
+import java.net.URISyntaxException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+
+import org.apache.avro.file.CodecFactory;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.core.fs.Path;
+import org.junit.Test;
+
+/**
+ * Tests for {@link AvroOutputFormat}
+ */
+public class AvroOutputFormatTest {
+
+    @Test
+    public void testSetCodecFactory() throws Exception {
+        // given
+        final AvroOutputFormat<DummyAvroType> outputFormat = new AvroOutputFormat<>(DummyAvroType.class);
+
+        // when
+        try {
+            outputFormat.setCodecFactory(CodecFactory.snappyCodec());
+        } catch (Exception ex) {
+            // then
+            fail("unexpected exception");
+        }
+    }
+
+    @Test
+    public void testSetCodecFactoryError() throws Exception {
+        // given
+        boolean error = false;
+        final AvroOutputFormat<DummyAvroType> outputFormat = new AvroOutputFormat<>(DummyAvroType.class);
+
+        // when
+        try {
+            outputFormat.setCodecFactory(null);
+        } catch (Exception ex) {
+            error = true;
+        }
+
+        // then
+        assertTrue(error);
+    }
+
+    @Test
+    public void testCompression() throws Exception {
+        // given
+        final Path outputPath = path("avro-output-file.avro");
+        final AvroOutputFormat<DummyAvroType> outputFormat = new AvroOutputFormat<>(outputPath, DummyAvroType.class);
+        outputFormat.setWriteMode(FileSystem.WriteMode.OVERWRITE);
+
+        final Path compressedOutputPath = path("avro-output-file-compressed.avro");
+        final AvroOutputFormat<DummyAvroType> compressedOutputFormat = new AvroOutputFormat<>(compressedOutputPath, DummyAvroType.class);
+        compressedOutputFormat.setWriteMode(FileSystem.WriteMode.OVERWRITE);
+        compressedOutputFormat.setCodecFactory(CodecFactory.snappyCodec());
+
+        // when
+        output(outputFormat);
+        output(compressedOutputFormat);
+
+        // then
+        assertTrue(fileSize(outputPath) > fileSize(compressedOutputPath));
+    }
+
+    private long fileSize(Path path) throws IOException {
+        return Files.size(Paths.get(path.getPath()));
+    }
+
+    private void output(final AvroOutputFormat<DummyAvroType> outputFormat) throws IOException {
+        outputFormat.configure(new Configuration());
+        outputFormat.open(1, 1);
+        for (int i = 0; i < 100; i++) {
+            outputFormat.writeRecord(new DummyAvroType(1));
+        }
+        outputFormat.close();
+    }
+
+    private Path path(final String virtualPath) throws URISyntaxException {
+        return new Path(Paths.get(getClass().getResource("/").toURI()).toString() + "/" + virtualPath);
--- End diff --

Please use `File.createTempFile()` to create a file in the default temp 
space.
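
A minimal sketch of the suggestion (prefix and suffix are illustrative), building Flink's Path on top of a JDK temp file:

{code}
// create the test output in the JVM's default temp directory
final java.io.File tmpFile = java.io.File.createTempFile("avro-output-file", ".avro");
tmpFile.deleteOnExit();
final Path outputPath = new Path(tmpFile.toURI());
{code}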


> Compression for AvroOutputFormat
> 
>
> Key: FLINK-4771
> URL: https://issues.apache.org/jira/browse/FLINK-4771
> Project: Flink
>  Issue Type: Improvement
>  Components: Batch Connectors and Input/Output Formats
>Reporter: Lars Bachmann
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently it is not possible to set a compression codec for the 
> AvroOutputFormat. 
> This improvement will provide a setter for the Avro CodecFactory, which is 
> used by the DataFileWriter.
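
A minimal usage sketch of the added setter (the {{User}} record type and output path are illustrative):

{code}
AvroOutputFormat<User> format = new AvroOutputFormat<>(new Path("/tmp/users.avro"), User.class);
format.setCodecFactory(CodecFactory.snappyCodec());  // compress data blocks with Snappy
{code}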



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4506) CsvOutputFormat defaults allowNullValues to false, even though doc and declaration says true

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562187#comment-15562187
 ] 

ASF GitHub Bot commented on FLINK-4506:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2477#discussion_r82594396
  
--- Diff: 
flink-java/src/test/java/org/apache/flink/api/java/io/CsvOutputFormatTest.java 
---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.java.io;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.flink.api.common.io.FileOutputFormat;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.FSDataInputStream;
+import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.core.fs.Path;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+
+public class CsvOutputFormatTest {
+
+   private static final Path PATH = new Path("csv_output_test_file.csv");
--- End diff --

Please use `File.createTempFile()` to create a file in the default temp 
space.


> CsvOutputFormat defaults allowNullValues to false, even though doc and 
> declaration says true
> 
>
> Key: FLINK-4506
> URL: https://issues.apache.org/jira/browse/FLINK-4506
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats, Documentation
>Reporter: Michael Wong
>Assignee: Kirill Morozov
>Priority: Minor
>
> In the constructor, it has this
> {code}
> this.allowNullValues = false;
> {code}
> But in the setAllowNullValues() method, the doc says allowNullValues is 
> true by default. Also, in the declaration of allowNullValues, the value is 
> set to true. It probably makes the most sense to change the constructor.
> {code}
>   /**
>* Configures the format to either allow null values (writing an empty 
> field),
>* or to throw an exception when encountering a null field.
>* 
>* by default, null values are allowed.
>*
>* @param allowNulls Flag to indicate whether the output format should 
> accept null values.
>*/
>   public void setAllowNullValues(boolean allowNulls) {
>   this.allowNullValues = allowNulls;
>   }
> {code}
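
A minimal sketch of the proposed fix: make the constructor agree with the field declaration and the Javadoc, either by dropping the contradictory assignment (so the field initializer applies) or by setting the value explicitly.

{code}
// in the constructor:
this.allowNullValues = true;  // previously: false, contradicting docs and declaration
{code}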



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2612: FLINK-4771: Compression for AvroOutputFormat

2016-10-10 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2612#discussion_r82594969
  
--- Diff: 
flink-batch-connectors/flink-avro/src/test/java/org/apache/flink/api/java/io/AvroOutputFormatTest.java
 ---
@@ -0,0 +1,106 @@
+package org.apache.flink.api.java.io;
+
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import java.io.IOException;
+import java.net.URISyntaxException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+
+import org.apache.avro.file.CodecFactory;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.core.fs.Path;
+import org.junit.Test;
+
+/**
+ * Tests for {@link AvroOutputFormat}
+ */
+public class AvroOutputFormatTest {
+
+    @Test
+    public void testSetCodecFactory() throws Exception {
+        // given
+        final AvroOutputFormat<DummyAvroType> outputFormat = new AvroOutputFormat<>(DummyAvroType.class);
+
+        // when
+        try {
+            outputFormat.setCodecFactory(CodecFactory.snappyCodec());
+        } catch (Exception ex) {
+            // then
+            fail("unexpected exception");
+        }
+    }
+
+    @Test
+    public void testSetCodecFactoryError() throws Exception {
+        // given
+        boolean error = false;
+        final AvroOutputFormat<DummyAvroType> outputFormat = new AvroOutputFormat<>(DummyAvroType.class);
+
+        // when
+        try {
+            outputFormat.setCodecFactory(null);
+        } catch (Exception ex) {
+            error = true;
+        }
+
+        // then
+        assertTrue(error);
+    }
+
+    @Test
+    public void testCompression() throws Exception {
+        // given
+        final Path outputPath = path("avro-output-file.avro");
+        final AvroOutputFormat<DummyAvroType> outputFormat = new AvroOutputFormat<>(outputPath, DummyAvroType.class);
+        outputFormat.setWriteMode(FileSystem.WriteMode.OVERWRITE);
+
+        final Path compressedOutputPath = path("avro-output-file-compressed.avro");
+        final AvroOutputFormat<DummyAvroType> compressedOutputFormat = new AvroOutputFormat<>(compressedOutputPath, DummyAvroType.class);
+        compressedOutputFormat.setWriteMode(FileSystem.WriteMode.OVERWRITE);
+        compressedOutputFormat.setCodecFactory(CodecFactory.snappyCodec());
+
+        // when
+        output(outputFormat);
+        output(compressedOutputFormat);
+
+        // then
+        assertTrue(fileSize(outputPath) > fileSize(compressedOutputPath));
+    }
+
+    private long fileSize(Path path) throws IOException {
+        return Files.size(Paths.get(path.getPath()));
+    }
+
+    private void output(final AvroOutputFormat<DummyAvroType> outputFormat) throws IOException {
+        outputFormat.configure(new Configuration());
+        outputFormat.open(1, 1);
+        for (int i = 0; i < 100; i++) {
+            outputFormat.writeRecord(new DummyAvroType(1));
+        }
+        outputFormat.close();
+    }
+
+    private Path path(final String virtualPath) throws URISyntaxException {
+        return new Path(Paths.get(getClass().getResource("/").toURI()).toString() + "/" + virtualPath);
--- End diff --

Please use `File.createTempFile()` to create a file in the default temp 
space.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2477: [FLINK-4506] CsvOutputFormat defaults allowNullVal...

2016-10-10 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2477#discussion_r82594396
  
--- Diff: 
flink-java/src/test/java/org/apache/flink/api/java/io/CsvOutputFormatTest.java 
---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.java.io;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.flink.api.common.io.FileOutputFormat;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.FSDataInputStream;
+import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.core.fs.Path;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+
+public class CsvOutputFormatTest {
+
+   private static final Path PATH = new Path("csv_output_test_file.csv");
--- End diff --

Please use `File.createTempFile()` to create a file in the default temp 
space.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

2016-10-10 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2577
  
Thanks for the PR, @zentol. Looks good to me! A couple of things I noticed: 

- The prefix of the interfaces (`Access*`) could be a bit confusing. What 
about these?
  - interface: `ExecutionGraph`
  - runtime: `RuntimeExecutionGraph`
  - archive: `ArchivedExecutionGraph` 

  That said, you probably chose not to rename the existing `ExecutionGraph` 
class to minimize changes.

- How about having an interface `Archiveable<T>` containing a `T archive()` 
method (see the sketch below)? This interface could be implemented by all 
runtime classes that are archivable.

- Did you choose not to port `ExecutionConfig` or `ExecutionConfigSummary` 
to your changes while rebasing?
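
A minimal sketch of the suggested interface (the `Serializable` bound is an assumption, since archived instances need to be serializable):

```java
import java.io.Serializable;

/** Implemented by runtime classes that can produce a serializable archived copy. */
public interface Archiveable<T extends Serializable> {

    /** Returns an immutable, serializable snapshot of this object. */
    T archive();
}
```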




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4720) Implement an archived version of the execution graph

2016-10-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562177#comment-15562177
 ] 

ASF GitHub Bot commented on FLINK-4720:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2577
  
Thanks for the PR, @zentol. Looks good to me! A couple of things I noticed: 

- The prefix of the interfaces (`Access*`) could be a bit confusing. What 
about these?
  - interface: `ExecutionGraph`
  - runtime: `RuntimeExecutionGraph`
  - archive: `ArchivedExecutionGraph` 

  That said, you probably chose not to rename the existing `ExecutionGraph` 
class to minimize changes.

- How about having an interface `Archiveable<T>` containing a `T archive()` 
method? This interface could be implemented by all runtime classes that are 
archivable.

- Did you choose not to port `ExecutionConfig` or `ExecutionConfigSummary` 
to your changes while rebasing?




> Implement an archived version of the execution graph
> 
>
> Key: FLINK-4720
> URL: https://issues.apache.org/jira/browse/FLINK-4720
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Webfrontend
>Affects Versions: 1.1.2
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.2.0
>
>
> In order to implement a job history server, as well as separate the 
> JobManager from the WebInterface, we require an archived version of the 
> ExecutionGraph that is Serializable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4788) State backend class cannot be loaded, because fully qualified name converted to lower-case

2016-10-10 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562134#comment-15562134
 ] 

Stephan Ewen commented on FLINK-4788:
-

Will look into this...

> State backend class cannot be loaded, because fully qualified name converted 
> to lower-case
> --
>
> Key: FLINK-4788
> URL: https://issues.apache.org/jira/browse/FLINK-4788
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.2
> Environment: Debian Linux and Mac OS X
>Reporter: Florian Koenig
>Priority: Blocker
> Fix For: 1.2.0, 1.1.3
>
>
> The name of the state backend in the configuration file is converted to 
> lower-case (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L587).
> Therefore, the class cannot be found, and the state backend (e.g., RocksDB) 
> is not loaded properly (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L603).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-4788) State backend class cannot be loaded, because fully qualified name converted to lower-case

2016-10-10 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen reassigned FLINK-4788:
---

Assignee: Stephan Ewen

> State backend class cannot be loaded, because fully qualified name converted 
> to lower-case
> --
>
> Key: FLINK-4788
> URL: https://issues.apache.org/jira/browse/FLINK-4788
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.2
> Environment: Debian Linux and Mac OS X
>Reporter: Florian Koenig
>Assignee: Stephan Ewen
>Priority: Blocker
> Fix For: 1.2.0, 1.1.3
>
>
> The name of the state backend in the configuration file is converted to 
> lower-case (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L587).
> Therefore, the class cannot be found, and the state backend (e.g., RocksDB) 
> is not loaded properly (see 
> https://github.com/eBay/Flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L603).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

