[no subject]

2024-06-18 Thread Dat Nguyen Tien



[no subject]

2024-05-08 Thread cloud young


[no subject]

2024-02-03 Thread Gavin McDonald
Hello to all users, contributors and Committers!

The Travel Assistance Committee (TAC) is pleased to announce that
travel assistance applications for Community over Code EU 2024 are now
open!

We will be supporting Community over Code EU, Bratislava, Slovakia,
June 3rd - 5th, 2024.

TAC exists to help those who would like to attend Community over Code
events, but are unable to do so for financial reasons. For more info
on this year's applications and qualifying criteria, please visit the
TAC website at < https://tac.apache.org/ >. Applications are already
open on https://tac-apply.apache.org/, so don't delay!

The Apache Travel Assistance Committee will only be accepting
applications from those people who are able to attend the full event.

Important: Applications close on Friday, March 1st, 2024.

Applicants have until the closing date above to submit their
applications (which should contain as much supporting material as
required to efficiently and accurately process their request); this
will enable TAC to announce successful applications shortly
afterwards.

As usual, TAC expects to deal with a range of applications from a
diverse range of backgrounds; therefore, we encourage (as always)
anyone thinking about sending in an application to do so ASAP.

For those who will need a visa to enter the country, we advise you to apply
now so that you have enough time in case of interview delays. Do not
wait until you know whether you have been accepted or not.

We look forward to greeting many of you in Bratislava, Slovakia in June,
2024!

Kind Regards,

Gavin

(On behalf of the Travel Assistance Committee)


[no subject]

2023-11-26 Thread Jintao Ma



[no subject]

2023-08-15 Thread Dennis Jung
(this is an issue from Flink 1.14)

Hello,

I've set up the following logic to consume messages from Kafka and produce
them to another Kafka broker. For the producer, I've configured
`Semantics.EXACTLY_ONCE` to send messages exactly once (and also set up
'StreamExecutionEnvironment::enableCheckpointing' with
'CheckpointingMode.EXACTLY_ONCE').


kafka A -> FlinkKafkaConsumer -> ... -> FlinkKafkaProducer -> kafka B


Although I've produced only 1 message to 'kafka A', the consumer
consumes the same message repeatedly.

When I remove the `FlinkKafkaProducer` part and make the job 'read only', this
does not happen.

Could someone suggest a way to debug or fix this?

Thank you.
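
A minimal sketch of the pipeline described above, for the legacy Flink 1.14
Kafka connector; broker addresses, topic names and timeout values are
placeholders, and the comments point at the settings that are commonly
involved when duplicates appear around an EXACTLY_ONCE producer:

import java.nio.charset.StandardCharsets;
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ExactlyOnceMirrorJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Producer transactions are only committed when a checkpoint completes.
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        Properties consumerProps = new Properties();
        consumerProps.setProperty("bootstrap.servers", "kafka-a:9092");   // placeholder
        consumerProps.setProperty("group.id", "mirror");                  // placeholder

        Properties producerProps = new Properties();
        producerProps.setProperty("bootstrap.servers", "kafka-b:9092");   // placeholder
        // Must stay below the broker's transaction.max.timeout.ms; otherwise the
        // producer fails, the job restarts, and the same input records are re-read.
        producerProps.setProperty("transaction.timeout.ms", "600000");

        KafkaSerializationSchema<String> outSchema = new KafkaSerializationSchema<String>() {
            @Override
            public ProducerRecord<byte[], byte[]> serialize(String element, Long timestamp) {
                return new ProducerRecord<>("topic-b", element.getBytes(StandardCharsets.UTF_8));
            }
        };

        env.addSource(new FlinkKafkaConsumer<>("topic-a", new SimpleStringSchema(), consumerProps))
           .addSink(new FlinkKafkaProducer<>(
                   "topic-b", outSchema, producerProps, FlinkKafkaProducer.Semantic.EXACTLY_ONCE));

        env.execute("kafka A -> kafka B (exactly-once)");
    }
}

Note also that any consumer used to verify 'kafka B' needs isolation.level set
to read_committed; with the default read_uncommitted it also sees records from
transactions that were later aborted, which can look like repeated messages.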


[no subject]

2023-05-23 Thread rania duni
Hello!

I have deployed the Flink Kubernetes operator 1.4.0 on Minikube and I
enabled the autoscaler. However, I get this error "2023-05-22 14:54:25,494
o.a.f.k.o.a.m.ScalingMetrics [ERROR][default/example] Cannot compute true
processing/output rate without busyTimeMsPerSecond" in the logs of the
operator. I don't know how to give access to this metric. Can anyone help?



[no subject]

2023-05-16 Thread Sharif Khan via user
sharif.k...@selise.ch



[no subject]

2023-05-07 Thread Anuj Jain
Hi Community,


I am trying to use flink-parquet for reading and writing Parquet files with
the Flink filesystem connectors.

In the file source I would be decoding Parquet files and converting them to
Avro records, and similarly in the file sink I would be encoding Avro records
into Parquet files.


For the source I am using

BulkFormat bulkFormat =
    new StreamFormatAdapter<>(AvroParquetReaders.forSpecificRecord(recordClazz));
FileSource source = FileSource.forBulkFileFormat(bulkFormat, path).build();


and for the sink I am using

FileSink sink = FileSink.forBulkFormat(path,
    AvroParquetWriters.forSpecificRecord(recordClazz)).build()


Query: The StreamFormatAdapter class is marked @Internal, and the
AvroParquetReaders and AvroParquetWriters classes are marked @Experimental.
Does that mean that in future Flink releases these classes can be changed in a
non-backward-compatible way, for example by plugging in a different third-party
library instead of "parquet-avro" or by changing the API structure, thus
impacting the application code?

Would it be safe to use the code as specified above?
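
One possible way to keep the source on public API only is to build it with
forRecordStreamFormat, which accepts the StreamFormat returned by
AvroParquetReaders directly, so the @Internal StreamFormatAdapter never appears
in application code. The sketch below reuses the names from the snippets above
(recordClazz and path come from the caller); it is only a sketch, not a
statement about API stability:

import org.apache.avro.specific.SpecificRecordBase;
import org.apache.flink.connector.file.sink.FileSink;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.AvroParquetReaders;
import org.apache.flink.formats.parquet.avro.AvroParquetWriters;

public class ParquetAvroIo {

    public static <T extends SpecificRecordBase> FileSource<T> source(Class<T> recordClazz, Path path) {
        // forRecordStreamFormat takes the public StreamFormat directly.
        return FileSource.forRecordStreamFormat(
                AvroParquetReaders.forSpecificRecord(recordClazz), path).build();
    }

    public static <T extends SpecificRecordBase> FileSink<T> sink(Class<T> recordClazz, Path path) {
        return FileSink.forBulkFormat(
                path, AvroParquetWriters.forSpecificRecord(recordClazz)).build();
    }
}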


Thanks and Regards

Anuj


[no subject]

2022-12-21 Thread deepthi s
Hello, I am new to event-time processing and need some help.



We have a Kafka source with very low qps, and multiple topic partitions have
no data for long periods of time. Additionally, data from the source can
come out of order (within bounds) and the application needs to process the
events in order per key. So we wanted to try to sort the events in the
application.


I am using BoundedOutOfOrdernessWatermarks for generating the watermarks
and using TumblingEventTimeWindows to collect the keyed events and sort
them in the ProcessWindowFunction. I am seeing that the window doesn't
close at all, and based on my understanding it is because there are no
events for some source partitions. All operators have the same parallelism
as the source Kafka partition count.



Pseudo code for my processing:



SingleOutputStreamOperator myStream =
    env.fromSource(
            setupSource(),
            WatermarkStrategy.noWatermarks(),
            "Kafka Source",
            TypeInformation.of(RowData.class))
        .map(rowData -> convertToMyEvent(rowData))
        .assignTimestampsAndWatermarks(WatermarkStrategy
            .forBoundedOutOfOrderness(Duration.ofMinutes(misAlignmentThresholdMinutes))
            .withTimestampAssigner((event, timestamp) -> event.timestamp))
        // Key the events by urn which is the key used for the output kafka topic
        .keyBy((event) -> event.urn.toString())
        // Set up a tumbling window of misAlignmentThresholdMinutes
        .window(TumblingEventTimeWindows.of(
            Time.of(misAlignmentThresholdMinutes, TimeUnit.MINUTES)))
        .process(new EventTimerOrderProcessFunction())
        .sinkTo(setupSink());



Is my understanding correct that, based on the WatermarkStrategy I
have, multiple operators will keep emitting *Long.MIN_VALUE - threshold* if
no events are read for those partitions, causing the downstream keyBy
operator to also emit a *Long.MIN_VALUE - threshold* watermark (as the min of
all watermarks it sees from its input map operators), and so the window
doesn't close at all? If yes, what is the right strategy to handle this? Is
there a way to combine EventTimeTrigger with ProcessingTimeoutTrigger?
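
One commonly suggested remedy for idle partitions is
WatermarkStrategy.withIdleness, so that partitions without traffic are marked
idle and stop holding back the combined watermark. A minimal sketch reusing the
names from the pseudo code above (MyEvent and the threshold are placeholders),
offered as one option rather than a drop-in fix:

import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class IdleAwareWatermarks {

    // Placeholder event type mirroring the fields used above.
    public static class MyEvent {
        public long timestamp;
        public String urn;
    }

    public static WatermarkStrategy<MyEvent> strategy(long misAlignmentThresholdMinutes) {
        return WatermarkStrategy
                .<MyEvent>forBoundedOutOfOrderness(Duration.ofMinutes(misAlignmentThresholdMinutes))
                .withTimestampAssigner((event, recordTimestamp) -> event.timestamp)
                // Partitions that emit no records for this long are marked idle and
                // no longer hold back the downstream minimum watermark.
                .withIdleness(Duration.ofMinutes(1));
    }
}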





-- 
Regards,
Deepthi


[no subject]

2022-06-11 Thread chenshu...@foxmail.com
unsubscribe



chenshu...@foxmail.com


[no subject]

2022-02-28 Thread 谭 海棠
Unsubscribe

Get Outlook for iOS


[no subject]

2022-02-28 Thread 谭 海棠
Unsubscribe


[no subject]

2022-01-07 Thread sudhansu jena
Unsubscribe


[no subject]

2021-11-15 Thread Uday Garikipati
Unsubscribe


[no subject]

2021-11-15 Thread xm lian
Unsubscribe


[no subject]

2021-10-12 Thread Andrew Otto
Hello,

I'm trying to use HiveCatalog with Kerberos.  Our Hadoop cluster, our Hive
Metastore, and our Hive Server are kerberized.  I can successfully submit
Flink jobs to Yarn authenticated as my users using a cached ticket, as well
as using a keytab.

However, I can't seem to register a HiveCatalog with my TableEnvironment.
Here's my code:

import org.apache.flink.table.api._
import org.apache.flink.table.catalog.hive.HiveCatalog

val settings = EnvironmentSettings.newInstance().inStreamingMode().build()
val tableEnv = TableEnvironment.create(settings)
val catalog = new HiveCatalog("analytics_hive", "flink_test", "/etc/hive/conf")
tableEnv.registerCatalog("analytics_hive", catalog)


Which causes an exception:
Caused by: java.lang.reflect.InvocationTargetException:
org.apache.hadoop.hive.metastore.api.MetaException: Could not connect to
meta store using any of the URIs provided. Most recent failure:
org.apache.thrift.transport.TTransportException: GSS initiate failed

(Full stacktrace here
.)

The same error happens if I try to submit this job using my cached kerberos
ticket, or with a keytab.
I have also tried wrapping the HiveCatalog in a Hadoop UserGroupInformation
PrivilegedExceptionAction as described here
 and got the
same result (no real idea what I'm doing here, just trying some things.)

Is there something more I have to do to use HiveCatalog with a kerberized
Hive Metastore?  Should Flink support this out of the box?

Thanks!
- Andrew Otto
  SRE, Wikimedia Foundation


[no subject]

2021-09-28 Thread Violeta Milanović
unsubscribe


[no subject]

2021-09-12 Thread chang li



[no subject]

2021-07-13 Thread guo shiguang



[no subject]

2021-07-06 Thread Maciek Bryński
Hi,
I have a very strange bug when using MATCH_RECOGNIZE.

I'm using some joins and unions to create an event stream. A sample event
stream (for one user) looks like this:

uuid                                  cif         event_type   v        balance    ts
621456e9-389b-409b-aaca-bca99eeb43b3  0004091386  trx          4294.38  74.524950  2021-05-01 04:42:57
7b2bc022-b069-41ca-8bbf-e93e3f0e85a7  0004091386  application  0E-18    74.524950  2021-05-01 10:29:10
942cd3ce-fb3d-43d3-a69a-aaeeec5ee90e  0004091386  application  0E-18    74.524950  2021-05-01 10:39:02
433ac9bc-d395-457n-986c-19e30e375f2e  0004091386  trx          4294.38  74.524950  2021-05-01 04:42:57

Then I'm using the following MATCH_RECOGNIZE definition (the trace function
will be explained later):

CREATE VIEW scenario_1 AS (
SELECT * FROM events
MATCH_RECOGNIZE(
    PARTITION BY cif
    ORDER BY ts
    MEASURES
        TRX.v as trx_amount,
        TRX.ts as trx_ts,
        APP_1.ts as app_1_ts,
        APP_2.ts as app_2_ts,
        APP_2.balance as app_2_balance
    ONE ROW PER MATCH
    PATTERN (TRX ANY_EVENT*? APP_1 NOT_LOAN*? APP_2) WITHIN INTERVAL '10' DAY
    DEFINE
        TRX AS trace(TRX.event_type = 'trx' AND TRX.v > 1000,
            'TRX', TRX.uuid, TRX.cif, TRX.event_type, TRX.ts),
        ANY_EVENT AS trace(true,
            'ANY_EVENT', TRX.uuid, ANY_EVENT.cif, ANY_EVENT.event_type, ANY_EVENT.ts),
        APP_1 AS trace(APP_1.event_type = 'application' AND APP_1.ts < TRX.ts + INTERVAL '3' DAY,
            'APP_1', TRX.uuid, APP_1.cif, APP_1.event_type, APP_1.ts),
        APP_2 AS trace(APP_2.event_type = 'application' AND APP_2.ts > APP_1.ts
            AND APP_2.ts < APP_1.ts + INTERVAL '7' DAY AND APP_2.balance < 100,
            'APP_2', TRX.uuid, APP_2.cif, APP_2.event_type, APP_2.ts),
        NOT_LOAN AS trace(NOT_LOAN.event_type <> 'loan',
            'NOT_LOAN', TRX.uuid, NOT_LOAN.cif, NOT_LOAN.event_type, NOT_LOAN.ts)
))


This scenario could be matched by the sample events because:
- TRX is matched by the event with ts 2021-05-01 04:42:57
- APP_1 by ts 2021-05-01 10:29:10
- APP_2 by ts 2021-05-01 10:39:02
Unfortunately I'm not getting any data, and it's not the watermarks' fault.

The trace function has the following code and gives me some logs:

public class TraceUDF extends ScalarFunction {

    public Boolean eval(Boolean condition,
            @DataTypeHint(inputGroup = InputGroup.ANY) Object... message) {
        log.info((condition ? "Condition true: " : "Condition false: ")
            + Arrays.stream(message).map(Object::toString).collect(Collectors.joining(" ")));
        return condition;
    }
}

And the log from this trace function is the following:

2021-07-06 13:09:43,762 INFO TraceUDF [] - Condition true: TRX 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T04:42:57
2021-07-06 13:12:28,914 INFO TraceUDF [] - Condition true: ANY_EVENT 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T15:28:34
2021-07-06 13:12:28,915 INFO TraceUDF [] - Condition false: APP_1 621456e9-389b-409b-aaca-bca99eeb43b3 0004091386 trx 2021-05-01T15:28:34
2021-07-06 13:12:28,915 INFO TraceUDF [] - Condition false: TRX 433ac9bc-d395-457n-986c-19e30e375f2e 0004091386 trx 2021-05-01T15:28:34

As you can see, 2 events are missing.
What can I do? I failed to create a minimal example of this bug. Any other
ideas?


[no subject]

2021-06-20 Thread 张万新
unsubscribe


[no subject]

2021-05-19 Thread Wenyi Xu



[no subject]

2021-05-19 Thread vtygoss
Hi,


I have the following use case:


I insert bounded data into a dynamic table (upsert-kafka) using Flink 1.12 on
YARN, but the YARN application is still running when the insert job has
finished, and the YARN containers are not released.


I tried to use BatchTableEnvironment, but "Primary key and unique key are not
supported yet"; I tried
StreamExecutionEnvironment.setRuntimeMode(RuntimeExecutionMode.BATCH), but
that does not work either.


Please offer some advice.


Regards




```
[test case code]
val (senv, btenv) = FlinkSession.getOrCreate()
val table = btenv.fromValues(
  Row.ofKind(RowKind.INSERT, "1"),
  Row.ofKind(RowKind.INSERT, "2")).select("f0")

btenv.createTemporaryView("bound", table)
btenv.executeSql(s"create table if not exists test_result(" +
  "id string, PRIMARY KEY(id) NOT ENFORCED) WITH(" +
  s"'connector'='kafka','topic'='test_result','properties.bootstrap.servers'='${KAFKA_SERVER}'," +
  s"'key.format'='json','value.format'='json')")
btenv.executeSql("insert into test_result select f0 from bound")
```

[no subject]

2021-05-12 Thread ronen flink



[no subject]

2020-12-11 Thread Arpith techy



[no subject]

2020-11-18 Thread Denis Nutiu
Hello everyone!

I'm new to Apache Flink and I would like to get some opinions on how I
should deploy my Flink jobs.

Let's say I want to do sentiment analysis for Slack workspaces. I have 10
companies, each having 2 Slack workspaces.

How should I deploy the Flink jobs if I'd like to utilize Flink efficiently?

1 sentiment analysis Flink job per slack workspace.
1 sentiment analysis Flink job per company.
1 sentiment analysis Flink job for all workspaces.

My intuition tells me that I should use 1 job per company, having a total
of 10 jobs so they would be easy to manage and restart if a fault occurs.
But I'd like to hear some other opinions.

Thank you!


-- 
Regards,
Denis Nutiu


[no subject]

2020-08-14 Thread Jaswin Shah
Hi,

I have a CoProcessFunction which emits data to the same side output from the
processElement1 method and from the onTimer method. But data is not getting
emitted to the side output from onTimer. Is it the case that we cannot emit
data to the same side output from both the onTimer and processElement methods?
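
Emitting to the same OutputTag from both callbacks is supported, since
OnTimerContext extends Context; a minimal sketch with illustrative String types
(not the original job's code) is below. If the onTimer output never shows up,
the usual things to check are that getSideOutput is called with the very same
tag definition and that the timer actually fires (event-time timers need the
watermark to advance).

import org.apache.flink.streaming.api.functions.co.KeyedCoProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputBothPaths extends KeyedCoProcessFunction<String, String, String, String> {

    // Anonymous subclass so the generic type is captured.
    public static final OutputTag<String> SIDE = new OutputTag<String>("side") {};

    @Override
    public void processElement1(String value, Context ctx, Collector<String> out) throws Exception {
        ctx.output(SIDE, "from processElement1: " + value);
        ctx.timerService().registerProcessingTimeTimer(
                ctx.timerService().currentProcessingTime() + 1000L);
    }

    @Override
    public void processElement2(String value, Context ctx, Collector<String> out) throws Exception {
        out.collect(value);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        // OnTimerContext extends Context, so the same tag can be used here as well.
        ctx.output(SIDE, "from onTimer at " + timestamp);
    }
}

Downstream, the side output is read with
processedStream.getSideOutput(SideOutputBothPaths.SIDE).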

Get Outlook for Android


[no subject]

2020-06-29 Thread Georg Heiler
Hi,

I am trying to use the Confluent schema registry in an interactive Flink Scala
shell.

My problem is that initializing the serializer from
ConfluentRegistryAvroDeserializationSchema fails:

```scala
val serializer =
ConfluentRegistryAvroDeserializationSchema.forSpecific[Tweet](classOf[Tweet],
schemaRegistryUrl)
error: type arguments [Tweet] conform to the bounds of none of the
overloaded alternatives of
value forSpecific: [T <: org.apache.avro.specific.SpecificRecord](x$1:
Class[T], x$2: String, x$3:
Int)org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroDeserializationSchema[T]
 [T <: org.apache.avro.specific.SpecificRecord](x$1: Class[T],
x$2: 
String)org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroDeserializationSchema[T]
```

Please see
https://stackoverflow.com/questions/62637009/flink-use-confluent-schema-registry-for-avro-serde
for details on how the shell was set up and which additional JARs were loaded.

Best,
Georg


[no subject]

2020-06-22 Thread 王宇
Hi all,
 an error occurred when I run Flink in a minicluster,
flink-version: 1.11, scala-version: 2.12.0.

Error:(33, 41) could not find implicit value for evidence parameter of type
org.apache.flink.api.common.typeinfo.TypeInformation[(Int, String)]
val solutionInput = env.fromElements((1, "1"))
Error:(33, 41) not enough arguments for method fromElements: (implicit
evidence$14: scala.reflect.ClassTag[(Int, String)], implicit evidence$15:
org.apache.flink.api.common.typeinfo.TypeInformation[(Int,
String)])org.apache.flink.api.scala.DataSet[(Int, String)].
Unspecified value parameter evidence$15.
val solutionInput = env.fromElements((1, "1"))
Error:(34, 40) could not find implicit value for evidence parameter of type
org.apache.flink.api.common.typeinfo.TypeInformation[(Int, String)]
val worksetInput = env.fromElements((2, "2"))
Error:(34, 40) not enough arguments for method fromElements: (implicit
evidence$14: scala.reflect.ClassTag[(Int, String)], implicit evidence$15:
org.apache.flink.api.common.typeinfo.TypeInformation[(Int,
String)])org.apache.flink.api.scala.DataSet[(Int, String)].
Unspecified value parameter evidence$15.
val worksetInput = env.fromElements((2, "2"))
Error:(47, 41) could not find implicit value for evidence parameter of type
org.apache.flink.api.common.typeinfo.TypeInformation[(Int, String)]
val solutionInput = env.fromElements((1, "1"))

I have tried
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/types_serialization.html#type-information-in-the-scala-api


thanks


[no subject]

2020-05-21 Thread 王立杰


[no subject]

2020-01-21 Thread Ankush Khanna




[no subject]

2019-10-21 Thread Utopia


[no subject]

2019-09-10 Thread Ben Yan
The following is the environment I use:
1. flink.version: 1.9.0
2. java version "1.8.0_212"
3. scala version: 2.11.12

When I wrote the following code in the scala programming language, I found
the following error:

// set up the batch execution environment
val bbSettings =
EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build()
val bbTableEnv = TableEnvironment.create(bbSettings)

error: Static methods in interface require -target:jvm-1.8
[ERROR] val bbTableEnv = TableEnvironment.create(bbSettings)

But when I use the Java programming language or Scala 2.12,
there is no problem.

If I use Scala 2.11, is there any way to solve this
problem? Thanks


Best,

Ben


[no subject]

2019-07-17 Thread tangkailin
Hello,
I am trying to use a HashMap in my window function of a Flink job. If the
parallelism changes, is this HashMap still a singleton? Shouldn't I do
something similar here?
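
A plain HashMap field is not a singleton: every parallel subtask (and every
restart) gets its own copy, and it is neither checkpointed nor redistributed
when the parallelism changes. Keyed state is the mechanism that stays
consistent across rescaling. A minimal sketch with illustrative types, not the
original job's code:

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

// Illustrative types: Long values, String keys, time windows.
public class ScopedStateWindowFunction
        extends ProcessWindowFunction<Long, String, String, TimeWindow> {

    // NOT shared across subtasks, not checkpointed, not moved on rescaling.
    private final java.util.HashMap<String, Long> localOnly = new java.util.HashMap<>();

    private static final MapStateDescriptor<String, Long> TOTALS =
            new MapStateDescriptor<>("perKeyTotals", String.class, Long.class);

    @Override
    public void process(String key, Context ctx, Iterable<Long> elements, Collector<String> out)
            throws Exception {
        long sum = 0L;
        for (Long v : elements) {
            sum += v;
        }
        // Keyed state is scoped to the current key, checkpointed, and moved with
        // its key group when the parallelism changes.
        MapState<String, Long> totals = ctx.globalState().getMapState(TOTALS);
        Long previous = totals.get(key);
        totals.put(key, (previous == null ? 0L : previous) + sum);
        out.collect(key + " -> " + totals.get(key));
    }
}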


Sent from Mail for Windows 10



[no subject]

2019-07-01 Thread wang xuchen
Hi Flink experts,

I am prototyping a real-time system that reads from a Kafka source with Flink
and calls out to an external system as part of the event processing. One of
the most important requirements is that reading from Kafka should NEVER stall,
even in the face of slowness in some async external calls while certain
Kafka offsets are being held. At-least-once processing is good enough.

Currently, I am using AsyncIO with a thread pool of size 20. My
understanding is if I use orderedwait with a large 'capacity', consumption
from Kafka should continue even if some external calls experience slowness
(holding the offsets) as long as the capacity is not exhausted.

(From my own reading of the Flink source code, the capacity of the orderedwait
function translates to the size of the OrderedStreamElementQueue.)

However, I expect that while the external calls are stuck, the stream source
should keep pumping out of Kafka as long as there is still capacity, but offsets
after the stuck record should NOT be committed back to Kafka (and the
checkpoint should also stall to accommodate the stalled offsets?).

My observation is that if I set the capacity large enough (max_int / 100 for
instance), the consumption was not stalled (which is good), but the offsets
were all committed back to Kafka AFTER the stalled records, all checkpoints
succeeded, and no back pressure was incurred.

In this case, if some machines crash, how does Flink recover the stalled
offsets? Which checkpoint does Flink roll back to? I understand that
committing offsets back to Kafka is merely to show progress to external
monitoring tools, but I hope Flink does bookkeeping somewhere to journal that
async call xyz has not returned and should be retried during recovery.

==

I've done some more experiments; it looks like Flink is able to recover the
record whose future I completed exceptionally, even if I use 'unorderedwait'
on the async stream.

Which leads to Fabian's earlier comment: 'Flink does not rely on Kafka
consumer offsets to recover; committing offsets to Kafka is merely to show
progress to external monitoring tools'.

I couldn't pinpoint the code that Flink uses to achieve it; maybe the
in-flight async invocations in 'UnorderedStreamElementQueue' are part of
the checkpoint and Flink saves the actual payload for later replay?

Can anyone shed some light?
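
For reference, a minimal sketch of the async pattern being discussed; the
external call, timeout and capacity values are placeholders, and the comments
describe my understanding of the recovery behaviour rather than a definitive
statement:

import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class AsyncEnrichment {

    // Hypothetical external call; replace with the real client.
    static String callExternalSystem(String key) {
        return "result-for-" + key;
    }

    public static DataStream<String> enrich(DataStream<String> input) {
        RichAsyncFunction<String, String> fn = new RichAsyncFunction<String, String>() {
            @Override
            public void asyncInvoke(String value, ResultFuture<String> resultFuture) {
                CompletableFuture
                        .supplyAsync(() -> callExternalSystem(value))
                        .thenAccept(r -> resultFuture.complete(Collections.singleton(r)));
            }
        };

        // 'capacity' (last argument) bounds the number of in-flight requests; once it is
        // reached the operator back-pressures the source. In-flight elements are stored
        // in the operator's checkpointed state and re-submitted on recovery, which is why
        // recovery does not depend on the offsets committed back to Kafka.
        return AsyncDataStream.unorderedWait(input, fn, 30, TimeUnit.SECONDS, 100);
    }
}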


[no subject]

2019-05-10 Thread an0
> Q2: after a, map(A), and map(B) would work fine. Assign watermarks
> immediatedly after a keyBy() is not a good idea, because 1) the records are
> shuffled and it's hard to reasoning about ordering, and 2) you lose the
> KeyedStream property and would have to keyBy() again (unless you use
> interpreteAsKeyedStream).

I know it is better to assignTimestampsAndWatermarks as early as possible. I 
intentionally put it after keyBy to check my understanding of this specific 
situation because I may have to use assignTimestampsAndWatermarks after keyBy 
in my application. I didn't make my question's intention clear though.

I'm glad to learn another trick from you: reinterpretAsKeyedStream :). Let's 
assume it is keyBy. 
assignTimestampsAndWatermarks.reinterpretAsKeyedStream.timeWindow(C).

I wanted to ask:
Because it is using keyed windows, every key's window is fired independently.
Even if I assignTimestampsAndWatermarks after keyBy and C.2 doesn't have any
data, and so generates no watermarks, it won't affect the firing of C.1's
windows. Is this understanding correct?

> Although C.2 does not receive data, it receives watermarks because WMs are
> broadcasted. They flow to any task that is reachable by any event. The case
> that all keys fall into C.1 is not important because a record for C.2 might
> arrive at any point in time. Also Flink does does not give any guarantees
> about how keys (or rather key groups) are assigned to tasks. If you rescale
> the application to a parallelism of 3, the active key group might be
> scheduled to C.2 or C.3.
> 
> Long story short, D makes progress in event time because watermarks are
> broadcasted.

I know watermarks are broadcasted. But I'm using assignTimestampsAndWatermarks
after keyBy, so C.2 doesn't receive watermarks; it creates watermarks from its
received data. Since it doesn't receive any data, it doesn't create any
watermarks. D couldn't make progress because one of its inputs, C.2, doesn't
make progress. Is this understanding correct?

On 2019/05/10 10:38:35, Fabian Hueske  wrote: 
> Hi,
> 
> Again answers below ;-)
> 
> Am Do., 9. Mai 2019 um 17:12 Uhr schrieb an0 :
> 
> > You are right, thanks. But something is still not totally clear to me.
> > I'll reuse your diagram with a little modification:
> >
> > DataStream a = ...
> > a.map(A).map(B).keyBy().timeWindow(C)
> >
> > and execute this with parallelism 2. However, keyBy only generates one
> > single key value, and assume they all go to C.1. Does the data flow look
> > like this?
> >
> > A.1 -- B.1 -/-- C.1
> > /
> > A.2 -- B.2 --/   C.2
> >
> > Will the lack of data into C.2 prevent C.1's windows from firing? Will the
> > location of assignTimestampsAndWatermarks(after a, after map(A), after
> > map(B), after keyBy) matter for the firing of C.1's windows
> 
> By my understanding, the answers are "no" and "no". Correct?
> >
> > Q1: no. Watermarks are propagated even in case of skewed key distribution.
> C.2 will also advance it's event-time clock (based on the WMs) and forward
> new WMs.
> Q2: after a, map(A), and map(B) would work fine. Assign watermarks
> immediatedly after a keyBy() is not a good idea, because 1) the records are
> shuffled and it's hard to reasoning about ordering, and 2) you lose the
> KeyedStream property and would have to keyBy() again (unless you use
> interpreteAsKeyedStream).
> 
> 
> > Now comes the *silly* question: does C.2 exist at all? Since there is only
> > one key value, only one C instance is needed. I could see that C.2 as a
> > physical instance may exist, but as a logical instance it shouldn't exist
> > in the diagram because it is unused. I feel the answer to this silly
> > question may be the most important in understanding my and(perhaps many
> > others') misunderstanding of situations like this.
> >
> > If C.2 exists just because parallelism is set to 2, even though it is not
> > logically needed, and it also serves as an input to the next operator if
> > there is one, then the mystery is completely solved for me.
> >
> > C.2 exists. Flink does not create a flow topology based on data values. As
> soon as there is a record with a key that would need to go to C.2, we would
> need it.
> 
> 
> > Use a concrete example:
> >
> > DataStream a = ...
> >
> > a.map(A).map(B).keyBy().assignTimestampsAndWatermarks(C).timeWindowAll(D)
> >
> > A.1 -- B.1 -/-- C.1 --\
> > / D
> > A.2 -- B.2 --/   C.2 --/
> >
> > D's watermark can not progress because C.2's watermark can not progress,
> > because C.2 doesn't have any input data, even though C.2 is not logically
> > needed but it does exist and it ruins everything :p. Is this understanding
> > correct?
> >
> 
> Although C.2 does not receive data, it receives watermarks because WMs are
> broadcasted. They flow to any task that is reachable by any event. The case
> that all keys fall into C.1 is not important because a record for C.2 might
> arrive at any 

[no subject]

2019-05-09 Thread an0
You are right, thanks. But something is still not totally clear to me. I'll 
reuse your diagram with a little modification:

DataStream a = ...
a.map(A).map(B).keyBy().timeWindow(C)

and execute this with parallelism 2. However, keyBy only generates one single 
key value, and assume they all go to C.1. Does the data flow look like this?

A.1 -- B.1 -/-- C.1
/
A.2 -- B.2 --/   C.2

Will the lack of data into C.2 prevent C.1's windows from firing? Will the 
location of assignTimestampsAndWatermarks(after a, after map(A), after map(B), 
after keyBy) matter for the firing of C.1's windows?

By my understanding, the answers are "no" and "no". Correct?

Now comes the *silly* question: does C.2 exist at all? Since there is only one 
key value, only one C instance is needed. I could see that C.2 as a physical 
instance may exist, but as a logical instance it shouldn't exist in the diagram 
because it is unused. I feel the answer to this silly question may be the most 
important in understanding my and(perhaps many others') misunderstanding of 
situations like this.

If C.2 exists just because parallelism is set to 2, even though it is not 
logically needed, and it also serves as an input to the next operator if there 
is one, then the mystery is completely solved for me.

Use a concrete example:

DataStream a = ...
a.map(A).map(B).keyBy().assignTimestampsAndWatermarks(C).timeWindowAll(D)

A.1 -- B.1 -/-- C.1 --\
/ D
A.2 -- B.2 --/   C.2 --/

D's watermark can not progress because C.2's watermark can not progress, 
because C.2 doesn't have any input data, even though C.2 is not logically 
needed but it does exist and it ruins everything :p. Is this understanding 
correct?

On 2019/05/09 10:01:44, Fabian Hueske  wrote: 
> Hi,
> 
> Please find my response below.
> 
> Am Fr., 3. Mai 2019 um 16:16 Uhr schrieb an0 :
> 
> > Thanks, but it does't seem covering this rule:
> > --- Quote
> > Watermarks are generated at, or directly after, source functions. Each
> > parallel subtask of a source function usually generates its watermarks
> > independently. These watermarks define the event time at that particular
> > parallel source.
> >
> > As the watermarks flow through the streaming program, they advance the
> > event time at the operators where they arrive. Whenever an operator
> > advances its event time, it generates a new watermark downstream for its
> > successor operators.
> >
> > Some operators consume multiple input streams; a union, for example, or
> > operators following a keyBy(…) or partition(…) function. Such an operator’s
> > current event time is the minimum of its input streams’ event times. As its
> > input streams update their event times, so does the operator.
> > --- End Quote
> >
> > The most relevant part, I believe, is this:
> > "Some operators consume multiple input streams…operators following a
> > keyBy(…) function. Such an operator’s current event time is the minimum of
> > its input streams’ event times."
> >
> > But the wording of "current event time is the minimum of its input
> > streams’ event times" actually implies that the input streams(produced by
> > keyBy) have different watermarks, the exactly opposite of what you just
> > explained.
> >
> >
> IMO, the description in the documentation is correct, but looks at the
> issue from a different angle.
> An operator task typically has many input from which it receives records.
> Depending on the number of input operators (one ore more) and the
> connection between the operator and its input operators (forward,
> partition, broadcast), an operator task might have a connection to one,
> some, or all tasks of its input operators. Each input task can send a
> different watermark, but each task will also send the same watermark to all
> its output tasks.
> 
> So, this is a matter of distinguishing receiving (different) watermarks and
> emitting (the same) watermarks.
> 
> Best, Fabian
> 
> 
> > On 2019/05/03 07:32:07, Fabian Hueske  wrote:
> > > Hi,
> > >
> > > this should be covered here:
> > >
> > https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/event_time.html#watermarks-in-parallel-streams
> > >
> > > Best, Fabian
> > >
> > > Am Do., 2. Mai 2019 um 17:48 Uhr schrieb an0 :
> > >
> > > > This explanation is exactly what I'm looking for, thanks! Is such an
> > > > important rule documented anywhere in the official document?
> > > >
> > > > On 2019/04/30 08:47:29, Fabian Hueske  wrote:
> > > > > An operator task broadcasts its current watermark to all downstream
> > tasks
> > > > > that might receive its records.
> > > > > If you have an the following code:
> > > > >
> > > > > DataStream a = ...
> > > > > a.map(A).map(B).keyBy().window(C)
> > > > >
> > > > > and execute this with parallelism 2, your plan looks like this
> > > > >
> > > > > A.1 -- B.1 --\--/-- C.1
> > > > >   X
> > > > > A.2 -- B.2 --/--\-- C.2
> > > > >
> > > > > 

[no subject]

2019-03-20 Thread Puneet Kinra
user@flink.apache.org
-- 
Cheers

Puneet Kinra

Mobile: +918800167808 | Skype: puneet.ki...@customercentria.com

e-mail: puneet.ki...@customercentria.com


[no subject]

2018-11-27 Thread Henry Dai
Hi,
Is there a way to get a table's metadata in Flink?
If I emit a table to Kafka, then how can I know the table columns when I
subscribe to the Kafka topic and restore the table using
tableEnv.registerDataStream("t1", source, "field1, field2 ...") in another
Flink program?

Flink should provide something like Hive's metastore to keep the metadata of
tables.
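
In later Flink releases this gap is addressed by catalogs: with the Hive
connector on the classpath, a HiveCatalog persists table definitions in a Hive
Metastore, so a second Flink program can resolve the columns by name. A
minimal sketch, with the catalog name, database and conf dir as placeholders:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class CatalogExample {
    public static void main(String[] args) {
        TableEnvironment tableEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());

        // Tables created while this catalog is active are stored in the Hive
        // Metastore and can be looked up by another job that registers the same catalog.
        HiveCatalog catalog = new HiveCatalog("hive", "default", "/etc/hive/conf");
        tableEnv.registerCatalog("hive", catalog);
        tableEnv.useCatalog("hive");
    }
}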

-- 
best wishes
hengyu


[no subject]

2018-11-15 Thread Steve Bistline
Well, hopefully the last problem with this project.

Any thoughts would be appreciated

=

java.lang.RuntimeException: Exception occurred while processing valve
output watermark:
at 
org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:265)
at 
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:189)
at 
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:111)
at 
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:184)
at 
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:103)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Failure happened in filter function.
at org.apache.flink.cep.nfa.NFA.createDecisionGraph(NFA.java:704)
at org.apache.flink.cep.nfa.NFA.computeNextStates(NFA.java:500)
at org.apache.flink.cep.nfa.NFA.process(NFA.java:252)
at 
org.apache.flink.cep.operator.AbstractKeyedCEPPatternOperator.processEvent(AbstractKeyedCEPPatternOperator.java:332)
at 
org.apache.flink.cep.operator.AbstractKeyedCEPPatternOperator.lambda$onEventTime$0(AbstractKeyedCEPPatternOperator.java:235)
at 
java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at 
java.util.stream.ReferencePipeline$Head.forEachOrdered(ReferencePipeline.java:590)
at 
org.apache.flink.cep.operator.AbstractKeyedCEPPatternOperator.onEventTime(AbstractKeyedCEPPatternOperator.java:234)
at 
org.apache.flink.streaming.api.operators.HeapInternalTimerService.advanceWatermark(HeapInternalTimerService.java:288)
at 
org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:108)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:734)
at 
org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:262)
... 7 more
Caused by: java.lang.NullPointerException
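
The trace ends in a NullPointerException raised from inside an NFA filter,
i.e. from one of the CEP pattern's conditions. Without the job's code this is
only a guess, but a defensive null check inside the condition is the usual
first step; a sketch against a hypothetical Event type (field names are
illustrative only):

import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;

public class NullSafePattern {

    // Hypothetical event type; replace with the POJO used in the failing job.
    public static class Event {
        public String name;
        public Double value;
    }

    public static Pattern<Event, ?> pattern() {
        return Pattern.<Event>begin("start")
                .where(new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event e) {
                        // Guard every field before dereferencing it, so the condition
                        // returns false instead of throwing when data is incomplete.
                        return e != null && e.name != null && e.value != null && e.value > 0;
                    }
                });
    }
}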


Re: Taskmanager SSL fails looking for Subject Alternative IP Address

2018-07-13 Thread Stephan Ewen
Thanks for reporting this.

Given that hostname verification seems to be the issue, I would assume that
the TaskManager somehow advertises a hostname in a form that is incompatible
with the verification in some setups.

While it would be interesting to dig deeper into why this happens, I think
we need to move away from hostname verification for internal communication
(rpc, TaskManager Netty, blob server) anyways for the following reasons:

  - Hostname verification is hard (or pretty much incompatible) between
containers in many container environments
  - The verification is mainly useful if you use a certificate in a
certification chain with some other trusted root certificates
  - For internal SSL between JM/TM and TM/TM, the recommended method is to
generate a single purpose certificate (may be self signed) and add a key
store and trust store with only that certificate. Given such a "single
certificate truststore", hostname verification does not add any additional
security (to my understanding).

For Flink 1.6, we are also adding transparent mutual authentication for
internal communication (RPC, blob server, netty data plane), which should
be an additional level of security. If this is used with dedicated
(self-signed) certificates, it should be very secure and not rely on
hostname verification.

That said, for external communication (REST calls against
JM/Dispatcher/...) clients should use hostname verification, because many
users use certificates in a certificate chain for these external endpoints.

Best,
Stephan



On Thu, Jul 12, 2018 at 11:02 PM, PACE, JAMES  wrote:

> I have the following SSL configuration for a 3 node HA flink cluster:
>
>
>
> #taskmanager.data.ssl.enabled: false
>
> security.ssl.enabled: true
>
> security.ssl.keystore: /opt/app/certificates/server-keystore.jks
>
> security.ssl.keystore-password: 
>
> security.ssl.key-password: 
>
> security.ssl.truststore: /opt/app/certificates/cacerts
>
> security.ssl.truststore-password: 
>
> security.ssl.verify-hostname: true
>
>
>
> The job we’re running is the sample WordCount.jar.  The running version of
> flink is 1.4.0.  It’s not the latest, but I didn’t see anything that looked
> like updating would solve this issue.
>
>
>
> If either security.ssl.verify-hostname is set to false or
> taskmanager.data.ssl.enabled is set to false, everything works fine.
>
>
>
> When flink is run in the above configuration above, with ssl fully enabled
> and security.ssl.verify-hostname: true, the flink job fails.  However, when
> going through the logs, SSL appears fine for akka, blob service, and
> jobmanager.
>
>
>
> The root cause looks to be Caused by: java.security.cert.CertificateException:
> No subject alternative names matching IP address xxx.xxx.xxx.xxx found.
>
> I have tried setting taskmanager.hostname to the FQDN of the host, but
> that did not change anything.
>
> We don’t generate certificates with SAN fields.
>
>
>
> Any thoughts would be appreciated.
>
>
>
> This is the full stack trace
>
> Caused by: java.io.IOException: Thread 'SortMerger Reading Thread'
> terminated due to an exception: Sending the partition request failed.
>
> at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$
> ThreadBase.run(UnilateralSortMerger.java:800)
>
> Caused by: 
> org.apache.flink.runtime.io.network.netty.exception.LocalTransportException:
> Sending the partition request failed.
>
> at org.apache.flink.runtime.io.network.netty.
> PartitionRequestClient$1.operationComplete(PartitionRequestClient.java:
> 119)
>
> at org.apache.flink.runtime.io.network.netty.
> PartitionRequestClient$1.operationComplete(PartitionRequestClient.java:
> 111)
>
> at org.apache.flink.shaded.netty4.io.netty.util.
> concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>
> at org.apache.flink.shaded.netty4.io.netty.util.
> concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
>
> at org.apache.flink.shaded.netty4.io.netty.util.
> concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
>
> at org.apache.flink.shaded.netty4.io.netty.channel.
> PendingWriteQueue.safeFail(PendingWriteQueue.java:252)
>
> at org.apache.flink.shaded.netty4.io.netty.channel.
> PendingWriteQueue.removeAndFailAll(PendingWriteQueue.java:112)
>
> at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.
> setHandshakeFailure(SslHandler.java:1256)
>
> at org.apache.flink.shaded.netty4.io.netty.handler.ssl.
> SslHandler.unwrap(SslHandler.java:1040)
>
> at org.apache.flink.shaded.netty4.io.netty.handler.ssl.
> SslHandler.decode(SslHandler.java:934)
>
>  

Taskmanager SSL fails looking for Subject Alternative IP Address

2018-07-12 Thread PACE, JAMES
I have the following SSL configuration for a 3 node HA flink cluster:

#taskmanager.data.ssl.enabled: false
security.ssl.enabled: true
security.ssl.keystore: /opt/app/certificates/server-keystore.jks
security.ssl.keystore-password: 
security.ssl.key-password: 
security.ssl.truststore: /opt/app/certificates/cacerts
security.ssl.truststore-password: 
security.ssl.verify-hostname: true

The job we're running is the sample WordCount.jar.  The running version of 
flink is 1.4.0.  It's not the latest, but I didn't see anything that looked 
like updating would solve this issue.

If either security.ssl.verify-hostname is set to false or 
taskmanager.data.ssl.enabled is set to false, everything works fine.

When flink is run in the above configuration above, with ssl fully enabled and 
security.ssl.verify-hostname: true, the flink job fails.  However, when going 
through the logs, SSL appears fine for akka, blob service, and jobmanager.

The root cause looks to be Caused by: java.security.cert.CertificateException: 
No subject alternative names matching IP address xxx.xxx.xxx.xxx found.
I have tried setting taskmanager.hostname to the FQDN of the host, but that did 
not change anything.
We don't generate certificates with SAN fields.

Any thoughts would be appreciated.

This is the full stack trace
Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated 
due to an exception: Sending the partition request failed.
at 
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:800)
Caused by: 
org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: 
Sending the partition request failed.
at 
org.apache.flink.runtime.io.network.netty.PartitionRequestClient$1.operationComplete(PartitionRequestClient.java:119)
at 
org.apache.flink.runtime.io.network.netty.PartitionRequestClient$1.operationComplete(PartitionRequestClient.java:111)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at 
org.apache.flink.shaded.netty4.io.netty.channel.PendingWriteQueue.safeFail(PendingWriteQueue.java:252)
at 
org.apache.flink.shaded.netty4.io.netty.channel.PendingWriteQueue.removeAndFailAll(PendingWriteQueue.java:112)
at 
org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.setHandshakeFailure(SslHandler.java:1256)
at 
org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1040)
at 
org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:934)
at 
org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:315)
at 
org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:229)
at 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1431)
at 
sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:535)
at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:813)
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
at 
org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1114)
at 
org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.unwrap

[no subject]

2018-06-21 Thread Vinod Gavhane


Regards,
Vinod Gavhane



Subject: Last chance to register for Flink Forward SF (April 10). Get 25% discount

2018-03-29 Thread Stephan Ewen
Hi all!

There are still some spots left to attend Flink Forward San Francisco, so
sign up soon before registration closes.

Use this promo code to get 25% off: MailingListFFSF

The 1-day conference takes place on Tuesday, April 10 in downtown SF. We
have a great lineup of speakers from companies working with Flink,
including Alibaba, American Express, Capital One, Comcast, eBay, Google,
Lyft, MediaMath, Netflix, Uber, Walmart Labs, and many others. See the full
program of sessions and speakers:
https://sf-2018.flink-forward.org/conference/

Also on Monday, April 9 we'll be holding training sessions for Flink
(Standard and Advanced classes) - it's a good opportunity to learn from some
Apache Flink experts.

Hope to see you there!

Best,
Stephan


[no subject]

2017-12-20 Thread chris snow




[no subject]

2017-10-03 Thread Aniket Deshpande
-- 
Yours Sincerely,
Aniket S Deshpande.


[no subject]

2016-12-20 Thread Abiy Legesse Hailemichael
I am running a standalone Flink cluster (1.1.2) and I have a stateful
streaming job that uses RocksDB as the state backend. I have two stateful
operators that are using ValueState<> and ListState<>. Every now and then
my job fails with the following exception:

Caused by: AsynchronousException{java.io.FileNotFoundException: File
file:/data/flink/checkpoints/471ef8996921bb9c29434abf35490a26/StreamMap_12_0/dummy_state/chk-4
does not exist}
at
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointThread.run(StreamTask.java:870)
Caused by: java.io.FileNotFoundException: File
file:/data/flink/checkpoints/471ef8996921bb9c29434abf35490a26/StreamMap_12_0/dummy_state/chk-4
does not exist
at
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
at
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
at
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
at
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at
org.apache.hadoop.fs.FileSystem.getContentSummary(FileSystem.java:1467)
at
org.apache.flink.contrib.streaming.state.RocksDBStateBackend$FinalSemiAsyncSnapshot.getStateSize(RocksDBStateBackend.java:688)
at
org.apache.flink.streaming.runtime.tasks.StreamTaskStateList.getStateSize(StreamTaskStateList.java:89)
at
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointThread.run(StreamTask.java:860)
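
The failing path is a local file:/ URI, so one likely cause is that the
checkpoint directory is not on a filesystem shared by the JobManager and all
TaskManagers. A minimal sketch of pointing the RocksDB backend at a shared
location (the HDFS URI is a placeholder; this is a guess at the cause, not a
confirmed diagnosis):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000);

        // The checkpoint URI must be reachable from every TaskManager and the JobManager
        // (HDFS, S3, NFS, ...); a node-local file:/ path leads to "File ... does not exist"
        // errors like the one above.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));

        // ... build the job and call env.execute(...)
    }
}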



Abiy Hailemichael
Software Engineer
Phone: (202) 355-8933
Email: abiybirtu...@gmail.com 


[no subject]

2016-11-13 Thread Thomas FOURNIER
Hello,

I'm trying to assign a unique (and deterministic) ID to a globally sorted
DataSet.

Given a DataSet of String, I can compute the frequency of each label as
follows:

val env = ExecutionEnvironment.getExecutionEnvironment

val data = env.fromCollection(
  List("a","b","c","a","a","d","a","a","a","b","b","c","a","c","b","c"))

val mapping = data.map(s => (s,1))
  .groupBy(0)
  .reduce((a,b) => (a._1, a._2 + b._2))
  .partitionByRange(1)
  .sortPartition(1, Order.DESCENDING)



I want the most frequent label to be ID 0 and so on in decreasing order. My
idea was to use zipWithIndex. But this does not guarantee that my DataSet
will be


[no subject]

2016-07-17 Thread Chen Bekor
Hi,

I need some assistance.

I'm trying to globally register arguments from my main function for further
extraction on the stream-processing nodes. My code base is Scala:

val env = StreamExecutionEnvironment.getExecutionEnvironment

val parameterTool = ParameterTool.fromArgs(args)

env.getConfig.setGlobalJobParameters(parameterTool)

but when I try to retrieve them I get a null pointer exception.

private lazy val parameters: GlobalJobParameters =
ExecutionEnvironment.getExecutionEnvironment.getConfig.getGlobalJobParameters

I have read this article
https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/best_practices.html
and I'm curious whether it is possible.

This is required in order to read from a configuration file holding a DB
connection string (bootstrapping DB connections on each processing node
during the bootstrapping phase).
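
Retrieving the parameters by calling ExecutionEnvironment.getExecutionEnvironment
again on a processing node returns a fresh config, which would explain the null
pointer; the pattern from the linked best-practices page is to read them from
the runtime context inside a rich function. A Java sketch (the parameter key is
illustrative):

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.configuration.Configuration;

public class DbConfiguredMapper extends RichMapFunction<String, String> {

    private transient String dbConnectionString;

    @Override
    public void open(Configuration parameters) throws Exception {
        // The global job parameters set on the ExecutionConfig in main() travel with
        // the job, so they are available here on every processing node.
        ParameterTool params = (ParameterTool)
                getRuntimeContext().getExecutionConfig().getGlobalJobParameters();
        dbConnectionString = params.getRequired("db.connection");  // key name is illustrative
        // ... open the DB connection here ...
    }

    @Override
    public String map(String value) {
        return value + " via " + dbConnectionString;
    }
}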

Regards,
Chen.


[no subject]

2016-04-16 Thread Ahmed Nader
Hello,
I'm new to Flink so this might seem a basic question. I added Flink to an
existing project using Maven and can run the program locally with
StreamExecutionEnvironment with no problems. However, I want to know how I can
submit jobs for that project, and be able to view and run these jobs from
Flink's web interface, while I don't have the flink/bin folder
in my project structure, as I only added the dependencies.
Thanks.
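
One way to submit from a project that only has the Flink dependencies is
createRemoteEnvironment, pointing at a running cluster's JobManager; jobs
submitted this way show up in the web interface. A minimal sketch with
placeholder host, port and jar path:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RemoteSubmit {
    public static void main(String[] args) throws Exception {
        // Host and port of the cluster's JobManager, plus the fat jar built from
        // this project, are placeholders to be adjusted to the actual setup.
        StreamExecutionEnvironment env = StreamExecutionEnvironment.createRemoteEnvironment(
                "jobmanager-host", 8081, "target/my-job-fat.jar");

        env.fromElements("hello", "flink")
           .print();

        env.execute("remote example");
    }
}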


[no subject]

2016-02-23 Thread Zach Cox
Hi - I typically use the Chrome browser on OS X, and notice that with
1.0.0-rc0 the job graph visualization displays the nodes in the graph, but
not any of the edges. Also the graph does not move around when dragging the
mouse.

The job graph visualization seems to work perfectly in Safari and Firefox
on OS X, however.

Is this a known issue or should I open a jira ticket?

Thanks,
Zach


[no subject]

2015-10-19 Thread Jakob Ericsson
Hello,

We are running into a strange problem with direct memory buffers. As far as
I know, we are not using any direct memory buffers inside our code.
This is a pretty trivial streaming application, just doing some deduplication
and unioning some Kafka streams.

/Jakob



2015-10-19 13:27:59,064 INFO  org.apache.flink.runtime.taskmanager.Task
- FilterAndTransform -> (Filter, Filter) (3/4) switched to
FAILED with exception.
org.apache.flink.runtime.io.network.netty.exception.LocalTransportException:
java.lang.OutOfMemoryError: Direct buffer memory
at
org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.exceptionCaught(PartitionRequestClientHandler.java:153)
at
io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
at
io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:224)
at
io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
at
io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
at
io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:224)
at
io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
at
io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
at
io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:737)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:310)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
at java.lang.Thread.run(Thread.java:745)
Caused by: io.netty.handler.codec.DecoderException:
java.lang.OutOfMemoryError: Direct buffer memory
at
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:234)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
... 9 more
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658)
at java.nio.DirectByteBuffer.(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
at
io.netty.buffer.PoolArena$DirectArena.newUnpooledChunk(PoolArena.java:651)
at io.netty.buffer.PoolArena.allocateHuge(PoolArena.java:237)
at io.netty.buffer.PoolArena.allocate(PoolArena.java:215)
at io.netty.buffer.PoolArena.reallocate(PoolArena.java:358)
at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:111)
at
io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251)
at
io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849)
at
io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841)
at
io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831)
at
io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92)
at
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:228)
... 10 more


[no subject]

2015-04-24 Thread Pa Rö
user-sc.1429880470.
oeiopbmoofcapkjibfab-paul.roewer1990=googlemail@flink.apache.org