[jira] [Created] (FLINK-8358) Hostname used by DataDog metric reporter is not configurable

2018-01-03 Thread Elias Levy (JIRA)
Elias Levy created FLINK-8358:
-

 Summary: Hostname used by DataDog metric reporter is not 
configurable
 Key: FLINK-8358
 URL: https://issues.apache.org/jira/browse/FLINK-8358
 Project: Flink
  Issue Type: Bug
  Components: Metrics
Affects Versions: 1.4.0
Reporter: Elias Levy


The hostname used by the DataDog metric reporter to report metrics is not 
configurable.  This can be problematic if the hostname that Flink uses is 
different from the hostname used by the system's DataDog agent.  

For instance, in our environment we use Chef, and via the DataDog Chef 
Handler, certain metadata such as host roles is associated with the hostname in 
the DataDog service.  The hostname used to submit this metadata is the name we 
have given the host.  But because Flink picks up the default name given by EC2 
to the instance, metrics submitted by Flink to DataDog under that hostname are 
not associated with the tags derived from Chef.

In the Job Manager we can avoid this issue by explicitly setting the config 
{{jobmanager.rpc.address}} to the hostname we desire.  I attempted to do the 
same on the Task Manager by setting the {{taskmanager.hostname}} config, but 
DataDog does not seem to pick up that value.

Digging through the code, it seems the DataDog metric reporter gets the 
hostname from the {{TaskManagerMetricGroup}} host variable, which seems to be 
set from {{taskManagerLocation.getHostname}}.  That in turn seems to be set by 
calling {{this.inetAddress.getCanonicalHostName()}}, which merely performs a 
reverse lookup on the IP address, and then calling 
{{NetUtils.getHostnameFromFQDN}} on the result.  The latter is further 
problematic because its result is a non-fully-qualified hostname.

More generally, there seems to be a need to specify the hostname of a JM or TM 
node in a way that can be reused across Flink components.
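For reference, the lookup chain described above can be sketched in isolation. The truncation below mirrors what {{NetUtils.getHostnameFromFQDN}} does (cutting at the first dot); the class name is purely illustrative:

```java
import java.net.InetAddress;

class HostnameLookupSketch {

    // Mirrors NetUtils.getHostnameFromFQDN: keep everything before the first
    // dot, which is why the reported hostname ends up non-fully-qualified.
    static String getHostnameFromFQDN(String fqdn) {
        int dot = fqdn.indexOf('.');
        return dot == -1 ? fqdn : fqdn.substring(0, dot);
    }

    public static void main(String[] args) throws Exception {
        // Reverse lookup on the local address, as TaskManagerLocation does.
        String fqdn = InetAddress.getLocalHost().getCanonicalHostName();
        System.out.println(getHostnameFromFQDN(fqdn));
    }
}
```

Whatever reverse DNS returns here is what ends up as the DataDog host tag, which is why a configurable override would help.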



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-8357) enable rolling in default log settings

2018-01-03 Thread Xu Mingmin (JIRA)
Xu Mingmin created FLINK-8357:
-

 Summary: enable rolling in default log settings
 Key: FLINK-8357
 URL: https://issues.apache.org/jira/browse/FLINK-8357
 Project: Flink
  Issue Type: Improvement
  Components: Logging
Reporter: Xu Mingmin


The release packages use {{org.apache.log4j.FileAppender}} for log4j and 
{{ch.qos.logback.core.FileAppender}} for logback. 

In most cases, if not all, we need to enable rotation in a production cluster, 
so I suppose it's a good idea to make rotation the default.
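As a sketch of what the default could look like (property names follow log4j 1.x's RollingFileAppender; the size and count values are illustrative, not proposed defaults):

```properties
# Replace org.apache.log4j.FileAppender with a size-based rolling appender.
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.file=${log.file}
log4j.appender.file.MaxFileSize=100MB
log4j.appender.file.MaxBackupIndex=10
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
```

A similar size-and-count policy would apply on the logback side.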





Re: Flink-Yarn-Kerberos integration

2018-01-03 Thread Shuyi Chen
Thanks a lot for the clarification, Eron. That's very helpful. Currently,
we are more concerned about 1) data access, but will get to 2) and 3)
eventually.

I was thinking of doing the following:
1) extend the current HadoopModule to use and refresh DTs as suggested in the
YARN Application Security docs.
2) I found that the current SecurityModule interface might be enough for
supporting other security mechanisms. However, the loading of security
modules is hard-coded rather than configuration-based. I think we can extend
SecurityUtils to load modules from the configuration, so we can implement our
own security mechanism in our internal repo and have Flink jobs load it at
runtime.
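A minimal sketch of what configuration-driven loading could look like (all class and method names below are illustrative, not Flink's actual SecurityModule API; the point is only the reflective instantiation from a config value):

```java
import java.util.ArrayList;
import java.util.List;

class ConfigurableSecurityLoader {

    // Stand-in for Flink's SecurityModule interface (illustrative only).
    interface SecurityModule {
        void install() throws Exception;
    }

    // Example module a user jar could contribute.
    public static class Dummy implements SecurityModule {
        public void install() {}
    }

    // Instantiate modules listed as a comma-separated config value, so custom
    // AuthN mechanisms can be plugged in via dynamic class loading.
    static List<SecurityModule> loadModules(String commaSeparatedClassNames)
            throws Exception {
        List<SecurityModule> modules = new ArrayList<>();
        for (String name : commaSeparatedClassNames.split(",")) {
            name = name.trim();
            if (name.isEmpty()) {
                continue;
            }
            Class<?> clazz = Class.forName(name);
            modules.add((SecurityModule) clazz.getDeclaredConstructor().newInstance());
        }
        return modules;
    }
}
```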

Please let me know your comments. Thanks a lot.

On Fri, Dec 22, 2017 at 3:05 PM, Eron Wright  wrote:

> I agree that it is reasonable to use Hadoop DTs as you describe.  That
> approach is even recommended in YARN's documentation (see Securing
> Long-lived YARN Services on the YARN Application Security page).   But one
> of the goals of Kerberos integration is to support Kerberized data access
> for connectors other than HDFS, such as Kafka, Cassandra, and
> Elasticsearch.   So your second point makes sense too, suggesting a general
> architecture for managing secrets (DTs, keytabs, certificates, oauth
> tokens, etc.) within the cluster.
>
> There's quite a few aspects to Flink security, including:
> 1. data access (e.g. how a connector authenticates to a data source)
> 2. service authorization and network security (e.g. how a Flink cluster
> protects itself from unauthorized access)
> 3. multi-user support (e.g. multi-user Flink clusters, RBAC)
>
> I mention these aspects to clarify your point about AuthN, which I took to
> be related to (1).   Do tell if I misunderstood.
>
> Eron
>
>
> On Wed, Dec 20, 2017 at 11:21 AM, Shuyi Chen  wrote:
>
> > Hi community,
> >
> > We are working on secure Flink on YARN. The current Flink-Yarn-Kerberos
> > integration requires each container of a job to log in to Kerberos via
> > keytab every, say, 24 hours, and does not use any Hadoop delegation token
> > mechanism except when localizing the container. As I fixed the current
> > Flink-Yarn-Kerberos integration (FLINK-8275) and tried to add more
> > features (FLINK-7860), I developed some concerns regarding the current
> > implementation. It can pose a scalability issue for the KDC, e.g., if the
> > YARN cluster is restarted and tens of thousands of containers suddenly
> > DDoS the KDC.
> >
> > I would like to propose improving the current Flink-YARN-Kerberos
> > integration by doing something like the following:
> > 1) the AppMaster (JobManager) periodically authenticates with the KDC and
> > obtains all required DTs for the job.
> > 2) all other TM or TE containers periodically retrieve new DTs from the
> > AppMaster (either through a secure HDFS folder or a secure Akka channel).
> >
> > Also, we want to extend Flink to support a pluggable AuthN mechanism,
> > because we have our own internal AuthN mechanism. We would like to add
> > support in Flink to authenticate periodically to our internal AuthN
> > service as well through, e.g., dynamic class loading, and use a similar
> > mechanism to distribute the credentials from the AppMaster to containers.
> >
> > I would like to get comments and feedbacks. I can also write a design doc
> > or create a Flip if needed. Thanks a lot.
> >
> > Shuyi
> >
> >
> >
> > --
> > "So you have to trust that the dots will somehow connect in your future."
> >
>



-- 
"So you have to trust that the dots will somehow connect in your future."


Re: [jira] [Created] (FLINK-8356) JDBCAppendTableSink does not work for Hbase Phoenix Driver

2018-01-03 Thread Flavio Pompermaier
I had a similar problem with the batch API... the problem is that you have to
enable autocommit in the connection URL. The JDBC connector should handle
this specific case better as well (IMHO).

See https://issues.apache.org/jira/browse/FLINK-7605

On 3 Jan 2018 22:25, "Paul Wu (JIRA)"  wrote:

> Paul Wu created FLINK-8356:
> --
>
>  Summary: JDBCAppendTableSink does not work for Hbase Phoenix
> Driver
>  Key: FLINK-8356
>  URL: https://issues.apache.org/jira/browse/FLINK-8356
>  Project: Flink
>   Issue Type: Bug
>   Components: Table API & SQL
> Affects Versions: 1.4.0
> Reporter: Paul Wu
>
>
> The following code runs without errors, but the data is not inserted into
> the HBase table. However, it does work for MySQL (see the commented out
> code). The Phoenix driver is from https://mvnrepository.com/
> artifact/org.apache.phoenix/phoenix/4.7.0-HBase-1.1
>
> String query = "select CURRENT_DATE SEGMENTSTARTTIME, CURRENT_DATE
> SEGMENTENDTIME, cast (imsi as varchar) imsi, cast(imei as varchar) imei
> from ts ";
>
> Table table = ste.sqlQuery(query);
> JDBCAppendTableSinkBuilder jdbc = JDBCAppendTableSink.builder();
> jdbc.setDrivername("org.apache.phoenix.jdbc.PhoenixDriver");
> jdbc.setDBUrl("jdbc:phoenix:hosts:2181:/hbase-unsecure");
> jdbc.setQuery("upsert INTO GEO_ANALYTICS_STREAMING_DATA
> (SEGMENTSTARTTIME,SEGMENTENDTIME, imsi, imei) values (?,?,?, ?)");
> // JDBCAppendTableSinkBuilder jdbc = JDBCAppendTableSink.builder();
> //jdbc.setDrivername("com.mysql.jdbc.Driver");
> //jdbc.setDBUrl("jdbc:mysql://localhost/test");
> //jdbc.setUsername("root").setPassword("");
> //jdbc.setQuery("insert INTO GEO_ANALYTICS_STREAMING_DATA
> (SEGMENTSTARTTIME,SEGMENTENDTIME, imsi, imei) values (?,?,?, ?)");
> //jdbc.setBatchSize(1);
> jdbc.setParameterTypes(Types.SQL_DATE, Types.SQL_DATE,
> Types.STRING, Types.STRING);
> JDBCAppendTableSink sink = jdbc.build();
> table.writeToSink(sink);
>
>
>
>


[jira] [Created] (FLINK-8356) JDBCAppendTableSink does not work for Hbase Phoenix Driver

2018-01-03 Thread Paul Wu (JIRA)
Paul Wu created FLINK-8356:
--

 Summary: JDBCAppendTableSink does not work for Hbase Phoenix 
Driver 
 Key: FLINK-8356
 URL: https://issues.apache.org/jira/browse/FLINK-8356
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.4.0
Reporter: Paul Wu


The following code runs without errors, but the data is not inserted into the 
HBase table. However, it does work for MySQL (see the commented out code). The 
Phoenix driver is from 
https://mvnrepository.com/artifact/org.apache.phoenix/phoenix/4.7.0-HBase-1.1

String query = "select CURRENT_DATE SEGMENTSTARTTIME, CURRENT_DATE SEGMENTENDTIME, "
        + "cast(imsi as varchar) imsi, cast(imei as varchar) imei from ts";

Table table = ste.sqlQuery(query);
JDBCAppendTableSinkBuilder jdbc = JDBCAppendTableSink.builder();
jdbc.setDrivername("org.apache.phoenix.jdbc.PhoenixDriver");
jdbc.setDBUrl("jdbc:phoenix:hosts:2181:/hbase-unsecure");
jdbc.setQuery("upsert INTO GEO_ANALYTICS_STREAMING_DATA "
        + "(SEGMENTSTARTTIME, SEGMENTENDTIME, imsi, imei) values (?, ?, ?, ?)");
// JDBCAppendTableSinkBuilder jdbc = JDBCAppendTableSink.builder();
// jdbc.setDrivername("com.mysql.jdbc.Driver");
// jdbc.setDBUrl("jdbc:mysql://localhost/test");
// jdbc.setUsername("root").setPassword("");
// jdbc.setQuery("insert INTO GEO_ANALYTICS_STREAMING_DATA "
//         + "(SEGMENTSTARTTIME, SEGMENTENDTIME, imsi, imei) values (?, ?, ?, ?)");
// jdbc.setBatchSize(1);
jdbc.setParameterTypes(Types.SQL_DATE, Types.SQL_DATE, Types.STRING, Types.STRING);
JDBCAppendTableSink sink = jdbc.build();
table.writeToSink(sink);





Re: Invalid lambda deserialization

2018-01-03 Thread Till Rohrmann
Hi Amit,

could this be related [1]? How do you build your job?

[1] https://bugs.eclipse.org/bugs/show_bug.cgi?id=439889

Cheers,
Till
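For background on the error itself: serializable lambdas round-trip through a compiler-generated `$deserializeLambda$` method in the capturing class, and the Eclipse compiler bug above produces a broken version of exactly that method. A self-contained demo of the mechanism (this works when compiled with javac; the names are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

class LambdaSerDemo {

    // A lambda is serializable only if its target type is Serializable; this is
    // the same requirement Flink's function interfaces satisfy.
    interface SerFn<A, B> extends Function<A, B>, Serializable {}

    static <T> T roundTrip(T obj) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) ois.readObject();  // invokes $deserializeLambda$
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        SerFn<Integer, Integer> inc = x -> x + 1;
        System.out.println(roundTrip(inc).apply(41)); // 42 under javac
    }
}
```

Rewriting the lambdas as anonymous classes, or building with javac instead of the Eclipse compiler, are the usual workarounds.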

On Wed, Jan 3, 2018 at 2:55 PM, Timo Walther  wrote:

> Hi Amit,
>
> which of the two lambdas caused the error? I guess it was the mapper after
> the parquet input, right? In both cases this should not happen. Maybe you
> can open an issue with a small reproducible code example?
>
> Thanks.
>
> Regards,
> Timo
>
>
> Am 1/3/18 um 12:15 PM schrieb Amit Jain:
>
> Hi Timo,
>>
>> Thanks a lot! A quick re-look over the code helped me find the lambdas
>> in use. I was using lambdas in the following two cases.
>>
>> DataSet newMainDataSet = mainDataSet
>>     .fullOuterJoin(latestDeltaDataSet, JoinHint.REPARTITION_HASH_SECOND)
>>     .where(keySelector).equalTo(keySelector)
>>     .with((first, second) -> first != null && second != null ? second
>>         : (first != null ? first : second))
>>     .filter(filterFunction)
>>     .returns(GenericRecord.class);
>>
>> DataSet mainDataSet = mergeTableSecond.readParquet(mainPath, avroSchema, env)
>>     .withParameters(parameters)
>>     .map(t -> t.f1)
>>     .returns(GenericRecord.class);
>>
>>
>>
>> On Wed, Jan 3, 2018 at 4:18 PM, Timo Walther  wrote:
>>
>> Hi Amit,
>>>
>>> are you using lambdas as parameters of a Flink Function or in a member
>>> variable? If yes, can you share a lambda example that fails?
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>> Am 1/3/18 um 11:41 AM schrieb Amit Jain:
>>>
>>> Hi,

 I'm writing a job to merge old data with changelogs using DataSet API
 where
 I'm reading changelog using TextInputFormat and old data using
 HadoopInputFormat.

 I can see, job manager has successfully deployed the program flow to
 worker
 nodes. However, workers are immediately going to failed state because of
 *Caused by: java.lang.IllegalArgumentException: Invalid lambda
 deserialization*


 Complete stack trace
 java.lang.RuntimeException: The initialization of the DataSource's
 outputs
 caused an error: Could not read the user code wrapper: unexpected
 exception
 type
 at
 org.apache.flink.runtime.operators.DataSourceTask.invoke(
 DataSourceTask.java:94)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
 at java.lang.Thread.run(Thread.java:748)
 Caused by:
 org.apache.flink.runtime.operators.util.CorruptConfigurationException:
 Could not read the user code wrapper: unexpected exception type
 at
 org.apache.flink.runtime.operators.util.TaskConfig.getStubWr
 apper(TaskConfig.java:290)
 at
 org.apache.flink.runtime.operators.BatchTask.instantiateUser
 Code(BatchTask.java:1432)
 at
 org.apache.flink.runtime.operators.chaining.ChainedMapDriver
 .setup(ChainedMapDriver.java:39)
 at
 org.apache.flink.runtime.operators.chaining.ChainedDriver.
 setup(ChainedDriver.java:90)
 at
 org.apache.flink.runtime.operators.BatchTask.initOutputs(
 BatchTask.java:1299)
 at
 org.apache.flink.runtime.operators.DataSourceTask.initOutput
 s(DataSourceTask.java:287)
 at
 org.apache.flink.runtime.operators.DataSourceTask.invoke(
 DataSourceTask.java:91)
 ... 2 more
 Caused by: java.io.IOException: unexpected exception type
 at java.io.ObjectStreamClass.throwMiscException(ObjectStreamCla
 ss.java:1682)
 at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClas
 s.java:1254)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
 am.java:2078)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStrea
 m.java:2287)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.
 java:2211)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
 am.java:2069)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:433)
 at
 org.apache.flink.util.InstantiationUtil.deserializeObject(In
 stantiationUtil.java:290)
 at
 org.apache.flink.util.InstantiationUtil.readObjectFromConfig
 (InstantiationUtil.java:248)
 at
 org.apache.flink.runtime.operators.util.TaskConfig.getStubWr
 apper(TaskConfig.java:288)
 ... 8 more
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
 ssorImpl.java:62)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
 thodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.lang.invoke.SerializedLambda.readResolve(SerializedLamb
 da.java:230)
 at 

Re: Need to restart flink job on yarn as supervisord does

2018-01-03 Thread Till Rohrmann
Hi Shivam,

could you elaborate a little bit on the OutOfMemory issue you're observing?
Maybe you could provide the logs.

Cheers,
Till

On Tue, Jan 2, 2018 at 2:12 PM, Shivam Sharma <28shivamsha...@gmail.com>
wrote:

> Hi,
>
> I am using below restart strategy
>
> // Retry always
> env.setRestartStrategy(RestartStrategies.failureRateRestart(
>     Integer.MAX_VALUE, // max failures per unit
>     Time.of(20, TimeUnit.MINUTES), // time interval for measuring failure rate
>     Time.of(10, TimeUnit.MINUTES) // delay
> ));
>
> But I am facing OutOfMemory issue.
>
> On Wed, Dec 27, 2017 at 1:23 PM, Ufuk Celebi  wrote:
>
> > Hey Shivam,
> >
> > check this out:
> > https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/restart_
> > strategies.html
> >
> > Does it answer your questions?
> >
> > – Ufuk
> >
> > On Tue, Dec 26, 2017 at 9:30 AM, Shivam Sharma <28shivamsha...@gmail.com
> >
> > wrote:
> > > I am submitting my Flink Job on Yarn(Amazon EMR)
> > >
> > > On Tue, Dec 26, 2017 at 1:59 PM, Shivam Sharma <
> 28shivamsha...@gmail.com
> > >
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> My Flink job fails due to external activity like when Kafka goes
> down. I
> > >> want to restart my Flink job after certain time interval.
> > >>
> > >> *I need to know best practices in this. How to restart Flink job
> > >> automatically.*
> > >>
> > >> Thanks
> > >>
> > >> --
> > >> Shivam Sharma
> > >> Data Engineer @ Goibibo
> > >> Indian Institute Of Information Technology, Design and Manufacturing
> > >> Jabalpur
> > >> Mobile No- (+91) 8882114744
> > >> Email:- 28shivamsha...@gmail.com
> > >> LinkedIn:-*https://www.linkedin.com/in/28shivamsharma
> > >> *
> > >>
> > >
> > >
> > >
> > > --
> > > Shivam Sharma
> > > Data Engineer @ Goibibo
> > > Indian Institute Of Information Technology, Design and Manufacturing
> > > Jabalpur
> > > Mobile No- (+91) 8882114744
> > > Email:- 28shivamsha...@gmail.com
> > > LinkedIn:-*https://www.linkedin.com/in/28shivamsharma
> > > *
> >
>
>
>
> --
> Shivam Sharma
> Data Engineer @ Goibibo
> Indian Institute Of Information Technology, Design and Manufacturing
> Jabalpur
> Mobile No- (+91) 8882114744
> Email:- 28shivamsha...@gmail.com
> LinkedIn:-*https://www.linkedin.com/in/28shivamsharma
> *
>


[jira] [Created] (FLINK-8355) DataSet Should not union a NULL row for AGG without GROUP BY clause.

2018-01-03 Thread sunjincheng (JIRA)
sunjincheng created FLINK-8355:
--

 Summary: DataSet Should not union a NULL row for AGG without GROUP 
BY clause.
 Key: FLINK-8355
 URL: https://issues.apache.org/jira/browse/FLINK-8355
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.5.0
Reporter: sunjincheng


Currently {{DataSetAggregateWithNullValuesRule}} will UNION a NULL row for a 
non-grouped aggregate query. When {{CountAggFunction}} supports 
{{COUNT(*)}} (FLINK-8325), the result will be incorrect.
For example, if table {{T1}} has 3 records and we run the following SQL on 
DataSet: 
{code}
SELECT COUNT(*) as cnt from T1 // cnt = 4 (incorrect; should be 3)
{code}





Re: Invalid lambda deserialization

2018-01-03 Thread Timo Walther

Hi Amit,

which of the two lambdas caused the error? I guess it was the mapper 
after the parquet input, right? In both cases this should not happen. 
Maybe you can open an issue with a small reproducible code example?


Thanks.

Regards,
Timo


Am 1/3/18 um 12:15 PM schrieb Amit Jain:

Hi Timo,

Thanks a lot! A quick re-look over the code helped me find the lambdas in use.
I was using lambdas in the following two cases.

DataSet newMainDataSet = mainDataSet
    .fullOuterJoin(latestDeltaDataSet, JoinHint.REPARTITION_HASH_SECOND)
    .where(keySelector).equalTo(keySelector)
    .with((first, second) -> first != null && second != null ? second
        : (first != null ? first : second))
    .filter(filterFunction)
    .returns(GenericRecord.class);

DataSet mainDataSet = mergeTableSecond.readParquet(mainPath, avroSchema, env)
    .withParameters(parameters)
    .map(t -> t.f1)
    .returns(GenericRecord.class);



On Wed, Jan 3, 2018 at 4:18 PM, Timo Walther  wrote:


Hi Amit,

are you using lambdas as parameters of a Flink Function or in a member
variable? If yes, can you share a lambda example that fails?

Regards,
Timo


Am 1/3/18 um 11:41 AM schrieb Amit Jain:


Hi,

I'm writing a job to merge old data with changelogs using DataSet API
where
I'm reading changelog using TextInputFormat and old data using
HadoopInputFormat.

I can see, job manager has successfully deployed the program flow to
worker
nodes. However, workers are immediately going to failed state because of
*Caused by: java.lang.IllegalArgumentException: Invalid lambda
deserialization*


Complete stack trace
java.lang.RuntimeException: The initialization of the DataSource's outputs
caused an error: Could not read the user code wrapper: unexpected
exception
type
at
org.apache.flink.runtime.operators.DataSourceTask.invoke(
DataSourceTask.java:94)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
at java.lang.Thread.run(Thread.java:748)
Caused by:
org.apache.flink.runtime.operators.util.CorruptConfigurationException:
Could not read the user code wrapper: unexpected exception type
at
org.apache.flink.runtime.operators.util.TaskConfig.getStubWr
apper(TaskConfig.java:290)
at
org.apache.flink.runtime.operators.BatchTask.instantiateUser
Code(BatchTask.java:1432)
at
org.apache.flink.runtime.operators.chaining.ChainedMapDriver
.setup(ChainedMapDriver.java:39)
at
org.apache.flink.runtime.operators.chaining.ChainedDriver.
setup(ChainedDriver.java:90)
at
org.apache.flink.runtime.operators.BatchTask.initOutputs(
BatchTask.java:1299)
at
org.apache.flink.runtime.operators.DataSourceTask.initOutput
s(DataSourceTask.java:287)
at
org.apache.flink.runtime.operators.DataSourceTask.invoke(
DataSourceTask.java:91)
... 2 more
Caused by: java.io.IOException: unexpected exception type
at java.io.ObjectStreamClass.throwMiscException(ObjectStreamCla
ss.java:1682)
at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClas
s.java:1254)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
am.java:2078)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStrea
m.java:2287)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
am.java:2069)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:433)
at
org.apache.flink.util.InstantiationUtil.deserializeObject(In
stantiationUtil.java:290)
at
org.apache.flink.util.InstantiationUtil.readObjectFromConfig
(InstantiationUtil.java:248)
at
org.apache.flink.runtime.operators.util.TaskConfig.getStubWr
apper(TaskConfig.java:288)
... 8 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
ssorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
thodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.lang.invoke.SerializedLambda.readResolve(SerializedLamb
da.java:230)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
ssorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
thodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClas
s.java:1248)
... 18 more
Caused by: java.lang.IllegalArgumentException: Invalid lambda
deserialization
at
org.myorg.quickstart.MergeTableSecond.$deserializeLambda$(Me
rgeTableSecond.java:41)
... 28 more


Running Environment
Flink: 1.3.2
Java: openjdk version "1.8.0_151"

Please help us resolve this issue.


--
Thanks,
Amit






[jira] [Created] (FLINK-8354) Flink Kafka connector ignores Kafka message headers

2018-01-03 Thread Mohammad Abareghi (JIRA)
Mohammad Abareghi created FLINK-8354:


 Summary: Flink Kafka connector ignores Kafka message headers 
 Key: FLINK-8354
 URL: https://issues.apache.org/jira/browse/FLINK-8354
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector
 Environment: Kafka 0.11.0.0
Flink 1.4.0
flink-connector-kafka-0.11_2.11 
Reporter: Mohammad Abareghi


Kafka introduced the notion of a Header for messages in version 0.11.0.0 
(https://issues.apache.org/jira/browse/KAFKA-4208).

But flink-connector-kafka-0.11_2.11, which supports Kafka 0.11.0.0, ignores 
headers when consuming Kafka messages. 

It would be useful in some scenarios, such as distributed log tracing, to 
support message headers in FlinkKafkaConsumer011 and FlinkKafkaProducer011. 
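One possible shape for the fix, sketched without the Kafka or Flink dependencies (the wrapper type and its fields are hypothetical, not an existing Flink API): a header-aware consumer could emit the payload together with the record headers.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical value type a header-aware FlinkKafkaConsumer011 could emit;
// Kafka's Header list is modeled here as a simple key -> bytes map.
class RecordWithHeaders<T> {

    final T value;
    final Map<String, byte[]> headers;

    RecordWithHeaders(T value, Map<String, byte[]> headers) {
        this.value = value;
        this.headers = Collections.unmodifiableMap(headers);
    }
}
```

For distributed tracing, a deserialization schema variant that receives the full ConsumerRecord (headers included) would serve the same purpose.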






Re: Docker images

2018-01-03 Thread Patrick Lucas
Hi Felix,

As a brief note, these images are not "official" in the sense of being owned
by the Apache Flink project or the ASF; rather, they are maintained by the
Flink community.

The sources currently live in a separate repo to make them much easier to
version and maintain independent of Flink itself. This is a very common
practice and applies to most other popular "official" images hosted on
Docker Hub.

--
Patrick Lucas

On Wed, Dec 20, 2017 at 4:20 AM, Felix Cheung 
wrote:

> Hi!
>
> Is there a reason the official docker images
> https://hub.docker.com/r/_/flink/
>
> Has sources in a different repo
> https://github.com/docker-flink/docker-flink
> ?
>


Re: Invalid lambda deserialization

2018-01-03 Thread Amit Jain
Hi Timo,

Thanks a lot! A quick re-look over the code helped me find the lambdas in use.
I was using lambdas in the following two cases.

DataSet newMainDataSet = mainDataSet
    .fullOuterJoin(latestDeltaDataSet, JoinHint.REPARTITION_HASH_SECOND)
    .where(keySelector).equalTo(keySelector)
    .with((first, second) -> first != null && second != null ? second
        : (first != null ? first : second))
    .filter(filterFunction)
    .returns(GenericRecord.class);

DataSet mainDataSet = mergeTableSecond.readParquet(mainPath, avroSchema, env)
    .withParameters(parameters)
    .map(t -> t.f1)
    .returns(GenericRecord.class);



On Wed, Jan 3, 2018 at 4:18 PM, Timo Walther  wrote:

> Hi Amit,
>
> are you using lambdas as parameters of a Flink Function or in a member
> variable? If yes, can you share a lambda example that fails?
>
> Regards,
> Timo
>
>
> Am 1/3/18 um 11:41 AM schrieb Amit Jain:
>
>> Hi,
>>
>> I'm writing a job to merge old data with changelogs using DataSet API
>> where
>> I'm reading changelog using TextInputFormat and old data using
>> HadoopInputFormat.
>>
>> I can see, job manager has successfully deployed the program flow to
>> worker
>> nodes. However, workers are immediately going to failed state because of
>> *Caused by: java.lang.IllegalArgumentException: Invalid lambda
>> deserialization*
>>
>>
>> Complete stack trace
>> java.lang.RuntimeException: The initialization of the DataSource's outputs
>> caused an error: Could not read the user code wrapper: unexpected
>> exception
>> type
>> at
>> org.apache.flink.runtime.operators.DataSourceTask.invoke(
>> DataSourceTask.java:94)
>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>> at java.lang.Thread.run(Thread.java:748)
>> Caused by:
>> org.apache.flink.runtime.operators.util.CorruptConfigurationException:
>> Could not read the user code wrapper: unexpected exception type
>> at
>> org.apache.flink.runtime.operators.util.TaskConfig.getStubWr
>> apper(TaskConfig.java:290)
>> at
>> org.apache.flink.runtime.operators.BatchTask.instantiateUser
>> Code(BatchTask.java:1432)
>> at
>> org.apache.flink.runtime.operators.chaining.ChainedMapDriver
>> .setup(ChainedMapDriver.java:39)
>> at
>> org.apache.flink.runtime.operators.chaining.ChainedDriver.
>> setup(ChainedDriver.java:90)
>> at
>> org.apache.flink.runtime.operators.BatchTask.initOutputs(
>> BatchTask.java:1299)
>> at
>> org.apache.flink.runtime.operators.DataSourceTask.initOutput
>> s(DataSourceTask.java:287)
>> at
>> org.apache.flink.runtime.operators.DataSourceTask.invoke(
>> DataSourceTask.java:91)
>> ... 2 more
>> Caused by: java.io.IOException: unexpected exception type
>> at java.io.ObjectStreamClass.throwMiscException(ObjectStreamCla
>> ss.java:1682)
>> at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClas
>> s.java:1254)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
>> am.java:2078)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStrea
>> m.java:2287)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
>> am.java:2069)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:433)
>> at
>> org.apache.flink.util.InstantiationUtil.deserializeObject(In
>> stantiationUtil.java:290)
>> at
>> org.apache.flink.util.InstantiationUtil.readObjectFromConfig
>> (InstantiationUtil.java:248)
>> at
>> org.apache.flink.runtime.operators.util.TaskConfig.getStubWr
>> apper(TaskConfig.java:288)
>> ... 8 more
>> Caused by: java.lang.reflect.InvocationTargetException
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>> ssorImpl.java:62)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>> thodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:498)
>> at java.lang.invoke.SerializedLambda.readResolve(SerializedLamb
>> da.java:230)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>> ssorImpl.java:62)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>> thodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:498)
>> at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClas
>> s.java:1248)
>> ... 18 more
>> Caused by: java.lang.IllegalArgumentException: Invalid lambda
>> deserialization
>> at
>> org.myorg.quickstart.MergeTableSecond.$deserializeLambda$(Me
>> rgeTableSecond.java:41)
>> ... 28 more
>>
>>
>> Running Environment
>> Flink: 1.3.2
>> Java: openjdk version "1.8.0_151"
>>
>> Please help us resolve this issue.
>>
>>
>> --
>> Thanks,
>> Amit
>>
>>
>


[jira] [Created] (FLINK-8353) Add support for timezones

2018-01-03 Thread Timo Walther (JIRA)
Timo Walther created FLINK-8353:
---

 Summary: Add support for timezones
 Key: FLINK-8353
 URL: https://issues.apache.org/jira/browse/FLINK-8353
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Reporter: Timo Walther


This is an umbrella issue for adding support for timezones in the Table & SQL 
API.

Usually companies work with different timezones simultaneously. We could add 
support for the new time classes introduced with Java 8 and enable our scalar 
functions to also work with those (or some custom time class implementations 
like those from Calcite). We need a good design for this to address most of the 
problems users face related to timestamps and timezones.

It is up for discussion how to ship date, time, timestamp instances through the 
cluster.
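As a flavor of what the Java 8 time classes buy us (purely illustrative, not a design proposal): the engine could ship zone-free instants through the cluster and apply a session or user timezone only at the edges.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class TimezoneSketch {

    // Internally: epoch-based and zone-free. At the edges: rendered in a zone.
    static ZonedDateTime atZone(long epochMillis, String zone) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of(zone));
    }

    public static void main(String[] args) {
        long epochMillis = 0L; // 1970-01-01T00:00:00Z
        System.out.println(atZone(epochMillis, "UTC"));
        System.out.println(atZone(epochMillis, "America/Los_Angeles"));
    }
}
```

The same instant renders differently per zone, which is exactly the ambiguity a session-timezone design would have to pin down.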






Re: Invalid lambda deserialization

2018-01-03 Thread Timo Walther

Hi Amit,

are you using lambdas as parameters of a Flink Function or in a member 
variable? If yes, can you share a lambda example that fails?


Regards,
Timo


Am 1/3/18 um 11:41 AM schrieb Amit Jain:

Hi,

I'm writing a job to merge old data with changelogs using DataSet API where
I'm reading changelog using TextInputFormat and old data using
HadoopInputFormat.

I can see, job manager has successfully deployed the program flow to worker
nodes. However, workers are immediately going to failed state because of
*Caused by: java.lang.IllegalArgumentException: Invalid lambda
deserialization*

Complete stack trace:
java.lang.RuntimeException: The initialization of the DataSource's outputs caused an error: Could not read the user code wrapper: unexpected exception type
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:94)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.runtime.operators.util.CorruptConfigurationException: Could not read the user code wrapper: unexpected exception type
at org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:290)
at org.apache.flink.runtime.operators.BatchTask.instantiateUserCode(BatchTask.java:1432)
at org.apache.flink.runtime.operators.chaining.ChainedMapDriver.setup(ChainedMapDriver.java:39)
at org.apache.flink.runtime.operators.chaining.ChainedDriver.setup(ChainedDriver.java:90)
at org.apache.flink.runtime.operators.BatchTask.initOutputs(BatchTask.java:1299)
at org.apache.flink.runtime.operators.DataSourceTask.initOutputs(DataSourceTask.java:287)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:91)
... 2 more
Caused by: java.io.IOException: unexpected exception type
at java.io.ObjectStreamClass.throwMiscException(ObjectStreamClass.java:1682)
at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1254)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2078)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:433)
at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:290)
at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:248)
at org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:288)
... 8 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.lang.invoke.SerializedLambda.readResolve(SerializedLambda.java:230)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1248)
... 18 more
Caused by: java.lang.IllegalArgumentException: Invalid lambda deserialization
at org.myorg.quickstart.MergeTableSecond.$deserializeLambda$(MergeTableSecond.java:41)
... 28 more


Running Environment
Flink: 1.3.2
Java: openjdk version "1.8.0_151"

Please help us resolve this issue.


--
Thanks,
Amit
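The trace above bottoms out in MergeTableSecond.$deserializeLambda$: the worker could not match the serialized lambda against the class bytes it has, which typically means the class on the workers was compiled separately from, or is a different version of, the one that serialized the lambda. A common workaround is a named static class instead of a lambda. A plain-Java sketch (Flink-free; the MapFn interface here merely mirrors the shape of Flink's MapFunction and is not a real Flink type):

```java
import java.io.*;

public class NamedFunctionWorkaround {
    // Stand-in for Flink's MapFunction: a Serializable single-method interface.
    interface MapFn<T, O> extends Serializable { O map(T value); }

    // A named static class serializes by class name plus fields, bypassing
    // $deserializeLambda$ entirely, so it tolerates separately compiled
    // class files on the workers far better than a serialized lambda does.
    static class UpperCaseMapper implements MapFn<String, String> {
        @Override public String map(String value) { return value.toUpperCase(); }
    }

    static String roundTrip(String input) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(new UpperCaseMapper());
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            @SuppressWarnings("unchecked")
            MapFn<String, String> restored = (MapFn<String, String>) in.readObject();
            return restored.map(input);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("flink"));  // FLINK
    }
}
```

In a real Flink job the same idea applies directly: pass `new UpperCaseMapper()` (implementing Flink's own MapFunction) to the operator instead of a lambda, and make sure client and workers run identical jars.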





Invalid lambda deserialization

2018-01-03 Thread Amit Jain