[jira] [Created] (FLINK-7558) Improve SQL ValidationException message.

2017-08-29 Thread sunjincheng (JIRA)
sunjincheng created FLINK-7558:
--

 Summary: Improve SQL ValidationException message.
 Key: FLINK-7558
 URL: https://issues.apache.org/jira/browse/FLINK-7558
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Reporter: sunjincheng
Assignee: sunjincheng


The current message does not say which function's operand types could not be
inferred:

org.apache.flink.table.api.ValidationException: SQL validation failed. Operand types of could not be inferred.
	at org.apache.flink.table.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:91)
	at org.apache.flink.table.api.TableEnvironment.sql(TableEnvironment.scala:513)
	at com.alibaba.blink.scala.tool.util.SqlJobAdapter.dealInserts(SqlJobAdapter.java:292)
	at com.alibaba.blink.scala.tool.util.JobBuildHelper.buildSqlJob(JobBuildHelper.java:80)
	at com.alibaba.blink.scala.tool.JobLauncher.main(JobLauncher.java:138)
Caused by: org.apache.flink.table.api.ValidationException: Operand types of could not be inferred.
	at org.apache.flink.table.functions.utils.ScalarSqlFunction$$anon$2$$anonfun$2.apply(ScalarSqlFunction.scala:110)
	at org.apache.flink.table.functions.utils.ScalarSqlFunction$$anon$2$$anonfun$2.apply(ScalarSqlFunction.scala:110)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.flink.table.functions.utils.ScalarSqlFunction$$anon$2.inferOperandTypes(ScalarSqlFunction.scala:110)
	at org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1769)
	at ...






[jira] [Created] (FLINK-7557) Fix typo for s3a config in AWS deployment documentation

2017-08-29 Thread Wei-Che Wei (JIRA)
Wei-Che Wei created FLINK-7557:
--

 Summary: Fix typo for s3a config in AWS deployment documentation
 Key: FLINK-7557
 URL: https://issues.apache.org/jira/browse/FLINK-7557
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Wei-Che Wei
Assignee: Wei-Che Wei


The property name {{fs.s3.buffer.dir}} for s3a in {{core-site.xml}} should be 
{{fs.s3a.buffer.dir}}.
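
For illustration, the corrected {{core-site.xml}} entry could look like this 
(the value is a placeholder local buffer directory, not part of this issue):

{code:xml}
<property>
  <!-- note the "s3a" in the property name -->
  <name>fs.s3a.buffer.dir</name>
  <value>/tmp/s3a</value>
</property>
{code}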



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


RE: Sync Flink

2017-08-29 Thread ziv
Hi Aljoscha,

Since you answered the question, you can definitely answer the bonus. I'd be 
grateful if you could explain why using watermarks, as you showed in "notions 
of time", can't help here.

Thanks.



From: Aljoscha Krettek-2 [via Apache Flink Mailing List archive.] 
[mailto:ml+s1008284n19405...@n3.nabble.com]
Sent: 28 August 2017 12:36
To: Meri Ziv
Subject: Re: Sync Flink

Hi,

This is not possible out of the box, but you could use a ProcessFunction (or 
rather a CoProcessFunction) to buffer elements and set a timer so that you only 
emit when the watermark has advanced on both inputs.
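
A minimal sketch of the idea (class and type names are mine; it assumes the two 
connected streams are keyed and have timestamps/watermarks assigned, since keyed 
state and event-time timers require that):

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.util.Collector;

public class SyncingCoProcess extends CoProcessFunction<Integer, Long, String> {

    private transient ListState<Integer> intBuffer;
    private transient ListState<Long> longBuffer;

    @Override
    public void open(Configuration parameters) {
        intBuffer = getRuntimeContext().getListState(
                new ListStateDescriptor<>("ints", Integer.class));
        longBuffer = getRuntimeContext().getListState(
                new ListStateDescriptor<>("longs", Long.class));
    }

    @Override
    public void processElement1(Integer value, Context ctx, Collector<String> out) throws Exception {
        // buffer instead of emitting, and ask to be called back at this element's timestamp
        intBuffer.add(value);
        ctx.timerService().registerEventTimeTimer(ctx.timestamp());
    }

    @Override
    public void processElement2(Long value, Context ctx, Collector<String> out) throws Exception {
        longBuffer.add(value);
        ctx.timerService().registerEventTimeTimer(ctx.timestamp());
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        // the timer fires only once the watermark of BOTH inputs has passed
        // `timestamp`, so everything buffered up to here can be emitted in sync
        for (Integer i : intBuffer.get()) {
            out.collect("int: " + i);
        }
        for (Long l : longBuffer.get()) {
            out.collect("long: " + l);
        }
        intBuffer.clear();
        longBuffer.clear();
    }
}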

Best,
Aljoscha

> On 28. Aug 2017, at 08:10, ziv <[hidden email]> wrote:
>
> People,
>
> I have two sources:
>
> 1) DataStream<Integer> int1 = env.addSource(new intWithDelay1()): generates a
> series of integers as a stream, with a 1-second delay between elements.
>
> 2) DataStream<Long> long3 = env.addSource(new longWithDelay3()): generates a
> series of longs as a stream, with a 3-second delay.
>
> I want to:
>
> int1.connect(long3).flatMap(new printElements());
>
> I want this transformation to be synchronized, so that an element from source 1
> is read only when the equivalent element from source 2 arrives:
> 1
> 1
> 2
> 2
> 3
> 3
> …
>
> And not:
> 1
> 1
> 2
> 3
> 2
> 4
> 5
> …
>
> any way to do that?
>
> (advanced: why does this behave differently from the example in "notions of
> time" at 11:20?)
>
> Thanks










[jira] [Created] (FLINK-7556) Fix fetch size configurable in JDBCInputFormat for MySQL Driver

2017-08-29 Thread Nycholas de Oliveira e Oliveira (JIRA)
Nycholas de Oliveira e Oliveira created FLINK-7556:
--

 Summary: Fix fetch size configurable in JDBCInputFormat for MySQL 
Driver
 Key: FLINK-7556
 URL: https://issues.apache.org/jira/browse/FLINK-7556
 Project: Flink
  Issue Type: Bug
  Components: Batch Connectors and Input/Output Formats
Affects Versions: 1.3.2
Reporter: Nycholas de Oliveira e Oliveira
Priority: Trivial


According to the MySQL documentation [1]:

* ResultSet

??By default, ResultSets are completely retrieved and stored in memory. In most 
cases this is the most efficient way to operate and, due to the design of the 
MySQL network protocol, is easier to implement. If you are working with 
ResultSets that have a large number of rows or large values and cannot allocate 
heap space in your JVM for the memory required, you can tell the driver to 
stream the results back one row at a time.

To enable this functionality, create a Statement instance in the following 
manner:??

{code:java}
stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
  java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);
{code}

??The combination of a forward-only, read-only result set, with a fetch size of 
Integer.MIN_VALUE serves as a signal to the driver to stream result sets 
row-by-row. After this, any result sets created with the statement will be 
retrieved row-by-row.??

Allow the *Integer.MIN_VALUE* to be accepted as a parameter for _setFetchSize_.
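
For illustration, a sketch of the intended usage with {{JDBCInputFormat}} once 
the fetch-size guard accepts the sentinel (driver, URL, query, and row type 
info are placeholder values):

{code:java}
JDBCInputFormat inputFormat = JDBCInputFormat.buildJDBCInputFormat()
    .setDrivername("com.mysql.jdbc.Driver")          // placeholder driver
    .setDBUrl("jdbc:mysql://localhost:3306/mydb")    // placeholder URL
    .setQuery("SELECT id, payload FROM big_table")   // placeholder query
    .setRowTypeInfo(rowTypeInfo)
    .setFetchSize(Integer.MIN_VALUE)  // MySQL's signal to stream row-by-row
    .finish();
{code}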


[1] - 
https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html





[jira] [Created] (FLINK-7554) Add a testing RuntimeContext to test utilities

2017-08-29 Thread Timo Walther (JIRA)
Timo Walther created FLINK-7554:
---

 Summary: Add a testing RuntimeContext to test utilities
 Key: FLINK-7554
 URL: https://issues.apache.org/jira/browse/FLINK-7554
 Project: Flink
  Issue Type: New Feature
  Components: Tests
Reporter: Timo Walther


When unit testing user-defined functions it would be useful to have an official 
testing {{RuntimeContext}} that uses Java collections for storing state, 
metrics, etc.

After executing the business logic, the user could then verify how the state of 
the UDF changed or which metrics have been collected.
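
For illustration, a minimal sketch of what such collection-backed state could 
look like (the class name is hypothetical, not part of this issue):

{code:java}
import org.apache.flink.api.common.state.ValueState;

// Hypothetical test double: state is kept in a plain field instead of a
// state backend, so a test can construct, inspect, and reset it directly.
public class TestingValueState<T> implements ValueState<T> {

    private T value;

    @Override
    public T value() {
        return value;
    }

    @Override
    public void update(T newValue) {
        this.value = newValue;
    }

    @Override
    public void clear() {
        this.value = null;
    }
}
{code}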

This issue includes documentation for the "Testing" section.





[jira] [Created] (FLINK-7553) Use new SinkFunction interface in FlinkKafkaProducer010

2017-08-29 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-7553:
---

 Summary: Use new SinkFunction interface in FlinkKafkaProducer010
 Key: FLINK-7553
 URL: https://issues.apache.org/jira/browse/FLINK-7553
 Project: Flink
  Issue Type: Bug
  Components: DataStream API
Reporter: Aljoscha Krettek
Assignee: Aljoscha Krettek
 Fix For: 1.4.0


This will allow us to get rid of the hybrid {{SinkFunction}}/{{StreamOperator}} 
nature of the Kafka 0.10.x sink.





[jira] [Created] (FLINK-7552) Extend SinkFunction interface with SinkContext

2017-08-29 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-7552:
---

 Summary: Extend SinkFunction interface with SinkContext
 Key: FLINK-7552
 URL: https://issues.apache.org/jira/browse/FLINK-7552
 Project: Flink
  Issue Type: Bug
  Components: DataStream API
Reporter: Aljoscha Krettek
Assignee: Aljoscha Krettek
 Fix For: 1.4.0


Now that we require Java 8 we can extend the {{SinkFunction}} interface without 
breaking backwards compatibility. I'm proposing this:

{code}
/**
 * Interface for implementing user defined sink functionality.
 *
 * @param <IN> Input type parameter.
 */
@Public
public interface SinkFunction<IN> extends Function, Serializable {

    /**
     * Function for standard sink behaviour. This function is called for every record.
     *
     * @param value The input record.
     * @throws Exception
     * @deprecated Use {@link #invoke(SinkContext, Object)}.
     */
    @Deprecated
    default void invoke(IN value) throws Exception {
    }

    /**
     * Writes the given value to the sink. This function is called for every record.
     *
     * @param context Additional context about the input record.
     * @param value The input record.
     * @throws Exception
     */
    default void invoke(SinkContext context, IN value) throws Exception {
        invoke(value);
    }

    /**
     * Context that {@link SinkFunction SinkFunctions} can use for getting
     * additional data about an input record.
     *
     * @param <T> The type of elements accepted by the sink.
     */
    @Public // Interface might be extended in the future with additional methods.
    interface SinkContext<T> {

        /**
         * Returns the timestamp of the current input record.
         */
        long timestamp();
    }
}
{code}

For now, this only allows access to the element timestamp. This would allow us 
to fix the abomination that is {{FlinkKafkaProducer010}}, which is a hybrid 
{{SinkFunction}}/{{StreamOperator}} only because it needs access to timestamps.
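
For illustration, a hypothetical user sink written against the proposed 
interface (class name and output are placeholders):

{code}
// Hypothetical sink: reads the element timestamp from the SinkContext
// without having to implement StreamOperator.
public class TimestampPrintingSink implements SinkFunction<String> {

    @Override
    public void invoke(SinkContext context, String value) {
        System.out.println(context.timestamp() + " -> " + value);
    }
}
{code}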





[jira] [Created] (FLINK-7551) Add VERSION to the REST urls.

2017-08-29 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-7551:
-

 Summary: Add VERSION to the REST urls. 
 Key: FLINK-7551
 URL: https://issues.apache.org/jira/browse/FLINK-7551
 Project: Flink
  Issue Type: Improvement
  Components: REST
Affects Versions: 1.4.0
Reporter: Kostas Kloudas
 Fix For: 1.4.0


This is to guarantee that we can update the REST API without breaking existing 
third-party clients.
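
For example (paths purely illustrative), clients would call {{/v1/jobs}} 
instead of {{/jobs}}, so that incompatible changes can be introduced under a 
new version prefix while existing clients keep working against the old one.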





[jira] [Created] (FLINK-7550) Give names to REST client/server for clearer logging.

2017-08-29 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-7550:
-

 Summary: Give names to REST client/server for clearer logging.
 Key: FLINK-7550
 URL: https://issues.apache.org/jira/browse/FLINK-7550
 Project: Flink
  Issue Type: Improvement
  Components: REST
Affects Versions: 1.4.0
Reporter: Kostas Kloudas
 Fix For: 1.4.0


This issue proposes to give names to the entities composing a REST-ful service 
and use these names when logging messages. This will help debugging.





[jira] [Created] (FLINK-7549) CEP - Pattern not discovered if source streaming is very fast

2017-08-29 Thread Paolo Rendano (JIRA)
Paolo Rendano created FLINK-7549:


 Summary: CEP - Pattern not discovered if source streaming is very 
fast
 Key: FLINK-7549
 URL: https://issues.apache.org/jira/browse/FLINK-7549
 Project: Flink
  Issue Type: Bug
  Components: CEP
Affects Versions: 1.3.2, 1.3.1
Reporter: Paolo Rendano


Hi all,
I'm doing some stress testing on my pattern, using JMeter to populate source 
data on a RabbitMQ queue. This queue contains statuses generated by different 
devices. My test case loops for 1000 cycles, each cycle sending the first and 
then the second status that together generate the event using Flink CEP 
(statuses keyed by device).
In my early tests I noticed that I only get partial results in the output 
(70-80% of the expected ones). Introducing a delay in the JMeter plan between 
sending the two statuses solved the problem. The minimum delay that makes 
things work is 20-25 ms (of course this is on my local machine; on other 
machines it may vary).

My code is structured this way (the following is a simplification):

final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.getConfig().setAutoWatermarkInterval(100L);

// source definition
DataStream<MyMessageWrapper> dataStreamSource = env
    .addSource(new MYRMQAutoboundQueueSource<>(connectionConfig,
        conf.getSourceExchange(),
        conf.getSourceRoutingKey(),
        conf.getSourceQueueName(),
        true,
        new MyMessageWrapperSchema()))
    .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor<MyMessageWrapper>(Time.minutes(1)) {
        private static final long serialVersionUID = -1L;

        @Override
        public long extractTimestamp(MyMessageWrapper element) {
            if (element.getData().get("stateTimestamp") == null) {
                throw new RuntimeException("Status Timestamp is null during time ordering for device ["
                    + element.getData().get("deviceCode") + "]");
            }
            return FlinkUtils.getTimeFromJsonTimestamp(element.getData().get("stateTimestamp")).getTime();
        }
    })
    .name("MyIncomingStatus");

// PATTERN DEFINITION
Pattern<MyMessageWrapper, ?> myPattern = Pattern
    .<MyMessageWrapper>begin("start")
    .subtype(MyMessageWrapper.class)
    .where(whereEquals("st", "none"))
    .next("end")
    .subtype(MyMessageWrapper.class)
    .where(whereEquals("st", "started"))
    .within(Time.minutes(3));

// CEP DEFINITION
PatternStream<MyMessageWrapper> myPatternStream =
    CEP.pattern(dataStreamSource.keyBy(keySelector), myPattern);

DataStream<Either<MyMessageWrapper, MyMessageWrapper>> outputStream =
    myPatternStream.flatSelect(patternFlatTimeoutFunction, patternFlatSelectFunction);

// SINK DEFINITION
outputStream.addSink(new MYRMQExchangeSink<>(connectionConfig, outputExchange,
    new MyMessageWrapperSchema())).name("MyGeneratedEvent");

Digging in and logging the messages received by Flink in "extractTimestamp", 
what happens is that at such a high message rate the source may receive 
messages with the same timestamp but with different deviceCodes.
Any idea?

Thanks, regards
Paolo





Re: "Unable to find registrar for hdfs" on Flink cluster

2017-08-29 Thread Jean-Baptiste Onofré

By the way, this kind of question should go on the user mailing list IMHO.

Thanks
Regards
JB




--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


"Unable to find registrar for hdfs" on Flink cluster

2017-08-29 Thread P. Ramanjaneya Reddy
Hi All,

I built a jar file from the Beam quickstart. While running the jar on a Flink
cluster I got the error below.

Has anybody hit this error?
Could you please help me figure out how to resolve it?

root1@master:~/NAI/Tools/flink-1.3.0$ bin/flink run -c org.apache.beam.examples.WordCount \
    /home/root1/NAI/Tools/word-count-beam/target/word-count-beam-bundled-0.1.jar \
    --runner=FlinkRunner \
    --filesToStage=/home/root1/NAI/Tools/word-count-beam/target/word-count-beam-bundled-0.1.jar \
    --inputFile=hdfs://master:9000/test/wordcount_input.txt \
    --output=hdfs://master:9000/test/wordcount_output919


This is the output I get:

Caused by: java.lang.IllegalStateException: Unable to find registrar for hdfs
	at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:447)
	at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:517)
	at org.apache.beam.sdk.io.FileBasedSink.convertToFileResourceIfPossible(FileBasedSink.java:204)
	at org.apache.beam.sdk.io.TextIO$Write.to(TextIO.java:296)
	at org.apache.beam.examples.WordCount.main(WordCount.java:182)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
	... 13 more


Thanks & Regards,
Ramanji.