at 8:14 AM Sebastian Zapata wrote:
> Hi everyone, I wanted to check whether somebody else has encountered an
> issue like this and could maybe give me a pointer.
>
> my Kryo serializer is failing to serialize
> com.google.protobuf.UnmodifiableLazyStringList
>
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.esotericsoftware.kryo.util.UnsafeUtil (file:/home/sebastian/.gradle/caches/modules-2/files-2.1/com.esotericsoftware/kryo/4.0.2/e38ab79c96b0c8600c8ac38cc81dab935f0abac9/kryo-4.0.2.jar) to constructor java.nio.DirectByteBuffer(
Any hints on what to check/change are highly appreciated.
best,
Sebastian S.
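For anyone hitting this before a fix: a commonly suggested workaround for protobuf types is to register a protobuf-aware Kryo serializer. A minimal, untested sketch; it assumes the com.twitter:chill-protobuf artifact, and MyProtoRecord is a placeholder for a generated protobuf message class:

import org.apache.flink.api.scala._
import com.twitter.chill.protobuf.ProtobufSerializer

val env = ExecutionEnvironment.getExecutionEnvironment
// MyProtoRecord is a hypothetical generated protobuf message class
env.getConfig.registerTypeWithKryoSerializer(
  classOf[MyProtoRecord], classOf[ProtobufSerializer])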
Hi Guillaume,
thank you for this great hint! It indeed fixed the mentioned issue.
Just from reading the changelog of 1.14.4, I would not have known that this
fix is included; maybe I was searching for the wrong terms, though.
Have a great day!
Sebastian
On Fri, Mar 25, 2022 at 10:51 AM Guillaume
link-dist_2.12-1.14.2.jar:1.14.2]
at
org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
[flink-dist_2.12-1.14.2.jar:1.14.2]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
"""
What can I do to track this down?
best regards,
Sebastian
Thanks a lot Timo,
I will check those links out and create an issue with more information.
Best Regards,
Sebastian
From: Timo Walther
Sent: Tuesday, March 16, 2021 15:29
To: Magri, Sebastian ; ro...@apache.org
Cc: user
Subject: Re: [Flink SQL] Leniency of
I validated that it's still accepted by the connector, but it's no longer
in the documentation.
It doesn't seem to help in my case.
Thanks,
Sebastian
____
From: Magri, Sebastian
Sent: Friday, March 12, 2021 18:50
To: Timo Walther ; ro...@apache.org
Cc
Hi Roman!
Seems like that option is no longer available.
Best Regards,
Sebastian
From: Roman Khachatryan
Sent: Friday, March 12, 2021 16:59
To: Magri, Sebastian ; Timo Walther
Cc: user
Subject: Re: [Flink SQL] Leniency of JSON parsing
Hi Sebastian,
Did you
I'm trying to extract data from a Debezium CDC source, in which one of the
backing tables has an open schema nested JSON field like this:
"objectives": {
"items": [
{
"id": 1,
"label": "test 1"
"size": 1000.0
},
{
"id":
gt;
--
With kind regards
----
Sebastian Liu 刘洋
Institute of Computing Technology, Chinese Academy of Science
Mobile\WeChat: +86-15201613655
E-mail: liuyang0...@gmail.com
QQ: 3239559
I don't think you need to employ a distributed system for working with this
dataset. An SGD implementation on a single machine should easily handle the
job.
Best,
Sebastian
2017-07-12 9:26 GMT+02:00 Andrea Spina :
> Dear Ziyad,
>
> Yep, I had encountered the same very long runtimes wi
rcumvent this restriction (sorting step?) or
otherwise optimize the process?
> Can you check in the code whether it is enabled? You'll have to go
> through a bit of the code to see that.
Although I'm not deeply involved with Flink's internal source code, I'll
try m
s reliably reproduce the issue.
All the best,
Sebastian
[1]
http://paste.gehaxelt.in/?2757c33ed3a3733b#jHQPPQNKKrE2wq4o9KCR48m+/V91S55kWH3dwEuyAkc=
[2]
http://paste.gehaxelt.in/?b106990deccecf1a#y22HgySqCYEOaP2wN6xxApGk/r4YICRkLCH2HBNN9yQ=
the aforementioned Flink settings do not
resolve the problem.
I guess, I need to debug this some more.
Best,
Sebastian
aSource (at
> createInput(ExecutionEnvironment.java:552)
> (org.apache.flink.api.java.hadoop.mapred.HadoopInputFormat)) ->
> Combine(Distinct at parseDumpData(SkipDumpParser.java:43)) (28/40)
> java.io.IOException: Cannot write record to fresh sort buffer. Record too
> large.
Best,
Sebastian
Hi Flavio,
thanks for pointing me to your old thread.
I don't have administrative rights on the cluster, but from what dmesg
reports, I could not find anything that looks like an OOM message.
So no luck for me, I guess...
Best,
Sebastian
Hi Ted,
thanks for bringing this to my attention.
I just rechecked my Java version and it is indeed version 8. Both the
code and the Flink environment run that version.
Cheers,
Sebastian
Hi Kurt,
thanks for the input.
What do you mean by "try to disable your combiner"? Any tips on how I
can do that?
I don't actively use any combine* DataSet API functions, so the calls to
the SynchronousChainedCombineDriver come from Flink.
Kind regards,
Sebastian
more
processed data before the failure.
The error messages and exceptions from an affected TaskManager are here
[1]. Unfortunately, I cannot find a java.lang.OutOfMemoryError in here.
Do you have another idea or something to try?
Thanks in advance,
Sebastian
[1]
http://paste.gehaxe
rminated due to an exception: null
A full stack trace can be found here [0].
I tried to reduce the taskmanager.memory.fraction (or so) and also the
amount of parallelism, but that did not help much.
Flink 1.0.3-Hadoop2.7 was used.
Any tips are appreciated.
Kind regards,
Sebas
Hi,
thanks for the help!
Making the class fooo static did the trick.
I was just a bit confused, because I'm using a similar construction
somewhere else in the code and it works flawlessly.
Best regards,
Sebastian
Hi,
I've heard of some methods that trigger an execution when using the
Batch API:
- print
- collect
- count
- execute
Some of them are discussed in older docs [0], but I can't find a good
list or hints in the newer ones. Are there any other methods?
Best regards,
Sebastian
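For reference, a sketch of how these behave in recent batch API versions (each eager call runs its own job; nothing beyond the standard Scala DataSet API is assumed):

import org.apache.flink.api.scala._

object Triggers {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val data = env.fromElements(1, 2, 3)

    data.print()                         // eager: runs a job and prints the result
    val n: Long = data.count()           // eager: runs a job, returns the count
    val local: Seq[Int] = data.collect() // eager: runs a job, fetches the data locally

    data.writeAsText("/tmp/out")         // lazy: only declares a sink
    env.execute("explicit run")          // eager: runs the pending sink
  }
}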
apache.flink.api.java.DataSet.filter(DataSet.java:287)
> testpackage.testclass.applyFilters(testclass.java:105)
I'm a little bit confused why Flink manages to serialize the "fooo2"
class but not the "fooo" class. Is this a bug, or am I missing something
here?
Cheers,
Sebastian
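A guess at the difference, shown as a sketch of the usual pitfall (not necessarily this exact code): a function that references a field of its enclosing class captures the whole enclosing instance, which must then be serializable as well.

import org.apache.flink.api.scala._

class Outer {                        // not Serializable itself
  val threshold = 10

  // fails: `threshold` is really `this.threshold`, so `this` (Outer) is captured
  def bad(data: DataSet[Int]): DataSet[Int] =
    data.filter(_ > threshold)

  // works: the local val is copied into the closure; Outer is not captured
  def good(data: DataSet[Int]): DataSet[Int] = {
    val t = threshold
    data.filter(_ > t)
  }
}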
an "A..B"-block) are within
the block's bounds.
I think I don't fully understand what happens when a record is split
between two or more blocks. Can a subtask, which for example handles
block 3, read into the fourth block to complete the record?
Cheers,
Sebastian
I'm also a bit confused why the Flink docs say that Bzip2 is not
splittable [6].
AFAIK, Hadoop (and Flink in compatibility mode) does support splittable
compressed data.
I would appreciate some input/ideas/help with this.
All the best,
Sebastian
[0]:
https://github.com/apache/mahout/blob/master/integ
Wrapper :
> java.io.NotSerializableException:
> org.apache.flink.api.java.operators.JoinOperator$EquiJoin
I'm using flink-1.1.4-hd27.
Any ideas how I can fix that bug? It worked properly with a simple .join().
Regards,
Sebastian
That sounds a bit over-complicated.
Best regards,
Sebastian
essed data, the XmlInputFormat
can't for some reason.
Is there a flink-ish way to accomplish this?
Best regards,
Sebastian
[0]:
https://github.com/apache/mahout/blob/ad84344e4055b1e6adff5779339a33fa29e1265d/examples/src/main/java/org/apache/mahout/classifier/bayes/XmlInputFormat.java
Hi Chesnay,
thanks for the input. Finding a word's first occurrence is part of the
algorithm.
To be exact, I'm trying to implement Adler's text authorship tracking in
Flink (http://www2007.org/papers/paper692.pdf, page 266).
Thanks,
Sebastian
of 1 and a .map(...)?
However, this approach would collect all data at one node and wouldn't
scale, correct?
Regards,
Sebastian
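A sketch of that construction (assuming the cut-off part asked about an operator with parallelism 1):

import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment
val data = env.fromElements(1, 2, 3, 4)

// the single mapPartition instance sees every record and can compute a
// global property, but all data funnels through one task: it won't scale
val globalCount = data
  .mapPartition(records => Iterator(records.size))
  .setParallelism(1)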
taSet?
- Is there a way to check if an element exists in a DataSet? E.g.
DataSet<>.contains(elem)?
Best regards,
Sebastian
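There is no built-in contains, but a sketch of a workaround is filter plus count, at the price of triggering a job:

import org.apache.flink.api.scala._

// runs a job; true if `elem` occurs at least once in the DataSet
def contains[T](ds: DataSet[T], elem: T): Boolean =
  ds.filter(_ == elem).count() > 0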
ent.
That's a bit ugly and probably not the best solution performance-wise,
but I haven't found a "Flink-ish" way to do it either.
Regards,
Sebastian
ctory
I fixed this by adding "implements java.io.Serializable" to the
IDataFactory (and all other interfaces right away) - I hope that won't
backfire in the future.
Anyway, the problem seems solved. Yay and thank you!
Kind regards,
Sebastian
k, so passing a configuration object to all functions/classes would
be overkill, I guess.
Thanks again and kind regards,
Sebastian
link-factorytest/blob/master/src/main/java/factorytest/Job.java
Config.java:
https://gitlab.tubit.tu-berlin.de/gehaxelt/flink-factorytest/blob/master/src/main/java/factorytest/Config.java
Example run:
https://gitlab.tubit.tu-berlin.de/gehaxelt/flink-factorytest/blob/master/EXAMPLE_RUN_OUTPUT.txt
Kind regards,
Sebastian Neef
here have experience with such things? I'm thinking of
connecting Flink to a lightweight in-memory key-value store such as
memcache for that.
Best,
Sebastian
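The usual shape for that is a rich function that opens one connection per parallel task. A sketch; KvClient is a hypothetical stand-in for an actual memcached client library:

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration

// hypothetical client interface, standing in for a real memcached library
trait KvClient { def get(key: String): String; def close(): Unit }
object KvClient { def connect(host: String, port: Int): KvClient = ??? }

class LookupEnrich extends RichMapFunction[String, (String, String)] {
  @transient private var client: KvClient = _

  override def open(parameters: Configuration): Unit =
    client = KvClient.connect("memcache-host", 11211) // once per parallel task

  override def map(key: String): (String, String) = (key, client.get(key))

  override def close(): Unit = client.close()
}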
You should also add Apache Mahout, whose new Samsara DSL also runs on Flink.
-s
On 06.04.2016 08:50, Henry Saputra wrote:
Thanks, Slim. I have just updated the wiki page with these entries.
On Tue, Apr 5, 2016 at 10:20 AM, Slim Baltagi <sbalt...@gmail.com> wrote:
Hi
The followi
Thanks,
Sebastian
From: ewenstep...@gmail.com on behalf of Stephan Ewen
Sent: Wednesday, December 9, 2015 11:15
To: user@flink.apache.org
Subject: Re: Taskmanager memory
Off heap memory is freed when the memory consuming operators release the memory.
The Java pr
your help!
Cheers,
Sebastian
These marketing-style comparisons are almost always either wrong, or
heavily tainted to make one particular project look good. They are
pretty bad style for Apache projects, in my opinion. For that reason, we
try to make more technical statements and comparisons.
+1
Is there a way to configure this setting for a delta iteration in the
scala API?
Best,
Sebastian
On 17.06.2015 10:04, Ufuk Celebi wrote:
On 17 Jun 2015, at 09:35, Mihail Vieru wrote:
Hi,
I had the same problem and setting the solution set to unmanaged helped:
VertexCentricConfiguration
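Spelled out as a sketch against the Gelly API of that era (MyVertexUpdater, MyMessenger, graph, and maxIterations are placeholders):

import org.apache.flink.graph.spargel.VertexCentricConfiguration

val parameters = new VertexCentricConfiguration
// keep the solution set on the heap instead of in Flink's managed memory
parameters.setSolutionSetUnmanagedMemory(true)

val result = graph.runVertexCentricIteration(
  new MyVertexUpdater, new MyMessenger, maxIterations, parameters)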
advantage that you can guarantee the sizes of the samples
and can easily support more involved techniques such as sampling with
replacement.
--sebastian
On 24.06.2015 10:38, Maximilian Alber wrote:
That's not the point. In machine learning one often divides a data set X
into, e.g., three sets, on
Hi Ufuk,
Can I configure this when running locally in the IDE or do I have to
install Flink for that?
Best,
Sebastian
On 17.06.2015 09:34, Ufuk Celebi wrote:
Hey Sebastian,
with "taskmanager.memory.fraction" you can give more memory to the Flink
runtime. Current default is to g
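For the IDE case the fraction can also be passed programmatically; a sketch, assuming the configuration key of that era:

import org.apache.flink.configuration.Configuration

val conf = new Configuration()
// share of the JVM heap handed to Flink's managed memory
conf.setFloat("taskmanager.memory.fraction", 0.9f)

// the Configuration overload lives on the Java entry point;
// raising the IDE's -Xmx helps as well
val env = org.apache.flink.api.java.ExecutionEnvironment.createLocalEnvironment(conf)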
Hi,
Flink has memory problems when I run an algorithm from my local IDE on a
2GB graph. Is there any way that I can increase the memory given to Flink?
Best,
Sebastian
Caused by: java.lang.RuntimeException: Memory ran out. numPartitions: 32
minPartition: 4 maxPartition: 4 number of overflow
scope of the function).
Cheers,
Sebastian
From: Till Rohrmann [mailto:trohrm...@apache.org]
Sent: Monday, 15 June 2015 14:16
To: user@flink.apache.org
Subject: Re: Random Selection
Hi Max,
the problem is that you’re trying to serialize the companion object of
scala.util.Random. Try to create
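Presumably the cut-off suggestion is to create the Random inside the function rather than referencing the companion object; a sketch:

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration

class RandomTag extends RichMapFunction[Int, (Int, Double)] {
  @transient private var rnd: java.util.Random = _

  override def open(parameters: Configuration): Unit =
    rnd = new java.util.Random() // created on the worker, never serialized

  override def map(value: Int): (Int, Double) = (value, rnd.nextDouble())
}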
, I noticed the PKG strategy and
maybe it will come in handy in some other place :)
So, thanks again for the pointers!
Cheers,
Sebastian
From: Gianmarco De Francisci Morales [mailto:g...@apache.org]
Sent: Friday, 12 June 2015 19:02
To: user@flink.apache.org
Subject: Re: Load balancing
Hi
That example was written in Scala. If you are using Java, the join
function is applied with "with(..)". However, logically you do the same in
Scala and Java; there are just some minor differences in the API.
-Original Message-
From: hagersaleh [mailto:loveallah1...@yahoo.
application-specific differences is not necessary." Maybe I am missing
something, but doesn't this assumption render PKG inapplicable to my case?
Objections to that are of course welcome :)
Cheers,
Sebastian
From: Gianmarco De Francisci Morales [mailto:g...@apache.org]
Sent: Wednesday, 10 June 2015 15
.join(departments.filter(_.location == ...))
  .where("department_id").equalTo("department_id")
  .apply((employee, department) => employee.last_name)
Cheers,
Sebastian
[1] http://en.wikipedia.org/wiki/Relational_algebra
-Original Message-
From: hagersaleh [mailto:loveallah1...@yaho
but only a few elements. So is there a way of telling, within the
partitioner, that data should reside on the same task manager? Thanks!
Cheers,
Sebastian
merge join. This will be slow with very many
duplicate keys, but should not break.
Let me know how it works!
On Tue, May 26, 2015 at 10:22 PM, Sebastian <s...@apache.org> wrote:
Hi,
What can I do to give Flink more memory when running it from my IDE?
I'm getting th
Hi,
What can I do to give Flink more memory when running it from my IDE? I'm
getting the following exception:
Caused by: java.lang.RuntimeException: Hash join exceeded maximum number
of recursions, without reducing partitions enough to be memory resident.
Probably cause: Too many duplicate k
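The merge join suggested above can be requested explicitly with a join hint; a sketch with two toy inputs:

import org.apache.flink.api.scala._
import org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint

val env = ExecutionEnvironment.getExecutionEnvironment
val left = env.fromElements((1, "a"), (1, "b"))
val right = env.fromElements((1, "x"), (1, "y"))

// sort-merge avoids the hash join's recursive re-partitioning: slower with
// very many duplicate keys, but it should not run out of memory
val joined = left.join(right, JoinHint.REPARTITION_SORT_MERGE)
  .where(0).equalTo(0)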
If the system has to decide data shipping strategies for a join (e.g.,
broadcasting one side) it helps to have good estimates of the input sizes.
On 04.05.2015 14:53, Flavio Pompermaier wrote:
Thanks Sebastian and Fabian for the feedback, just one last question:
what does change from the
filter + project is easier to understand for the system, as the number
of output tuples is guaranteed to be <= the number of input tuples. With
flatMap, the system cannot know an upper bound.
--sebastian
On 04.05.2015 14:43, Flavio Pompermaier wrote:
Hi Flinkers,
I'd like to know
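A small sketch of the two formulations side by side:

import org.apache.flink.api.scala._
import org.apache.flink.util.Collector

val env = ExecutionEnvironment.getExecutionEnvironment
val input = env.fromElements(("a", 1), ("b", -1))

// filter + project: the optimizer knows output size <= input size
val viaFilter = input.filter(_._2 > 0).map(_._1)

// flatMap: same result here, but no output bound is known to the optimizer
val viaFlatMap = input.flatMap { (t: (String, Int), out: Collector[String]) =>
  if (t._2 > 0) out.collect(t._1)
}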
Hi Hung,
A broadcast variable can also refer to an intermediate result of a Flink
computation.
Best,
Sebastian
On 25.04.2015 21:10, HungChang wrote:
Hi,
What would be the difference between using global variable and broadcasting
it?
A toy example:
// Using global
{{...
private static int
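A sketch of the broadcast-variable side of the comparison; here the broadcast set is itself a computed DataSet, not a constant:

import org.apache.flink.api.scala._
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration
import scala.collection.JavaConverters._

val env = ExecutionEnvironment.getExecutionEnvironment

// an intermediate result of the job, not a compile-time constant
val limits = env.fromElements(1, 2, 3).map(_ * 10)

val result = env.fromElements(5, 25, 45)
  .map(new RichMapFunction[Int, String] {
    @transient private var ls: List[Int] = _
    override def open(parameters: Configuration): Unit =
      ls = getRuntimeContext.getBroadcastVariable[Int]("limits").asScala.toList
    override def map(v: Int): String = s"$v exceeds ${ls.count(_ < v)} limits"
  })
  .withBroadcastSet(limits, "limits")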
/tuple but is the "top-level element"?
Cheers,
Sebastian
maybe let them start with Spark. According to the 3
month release cycle, the next version would be to come soon. So, I wanted to
ask if you have any concrete/rough release date planned already.
Cheers,
Sebastian
---
Sebastian Kruse
PhD student, Information Systems Group
Hasso-Plattner
espect to their shift
value and join both grouped subintervals. Then compute the correlation.
This again only works if the grouped data can be kept on the heap of the
task manager.
On Tue, Apr 7, 2015 at 1:29 PM, Sebastian <s...@apache.org> wrote:
How large are the individual
There are still some "Berlin Buzzwords" snippets in your texts ;)
http://flink-forward.org/?page_id=294
On 07.04.2015 14:24, Kostas Tzoumas wrote:
Hi everyone,
The folks at data Artisans and the Berlin Big Data Center are organizing
the first physical conference all about Apache Flink in Berli
How large are the individual time series?
-s
On 07.04.2015 12:42, Kostas Tzoumas wrote:
Hi everyone,
I'm forwarding a private conversation to the list with Mats' approval.
The problem is how to compute correlation between time series in Flink.
We have two time series, U and V, and need to com
Let me know once you have something to play with; then I will try to port my
graph-related code to it.
Best,
Sebastian
On 24.03.2015 09:08, Vasiliki Kalavri wrote:
Hi all,
there is no Scala API for Gelly yet and no corresponding JIRA either.
It's definitely in our plans, just not fo
Is Gelly supposed to be usable from Scala? It looks as if it is hardcoded
to use the Java API.
Best,
Sebastian
On 23.03.2015 23:15, Robert Metzger wrote:
Hi,
Gelly is not part of any official Flink release.
You have to use a Snapshot version of Flink if you want to try it out.
Sent from my
Maven seems to be unable to find the artifact; I also can't find it
in the Maven repository:
http://mvnrepository.com/search?q=flink
Best,
Sebastian
On 23.03.2015 23:10, Andra Lungu wrote:
Hi Sebastian,
For me it works just as described there, with 0.9, but there should be
no problem for
Hi,
Is Gelly already usable in the 0.8.1 release? I tried adding

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-gelly</artifactId>
  <version>0.8.1</version>
</dependency>

as described in [1], but my project fails to build.
Best,
Sebastian
[1] http://ci.apache.org/projects/flink/flink-docs-master/gelly_guide.html
I reproduced the bug and will look into that.
Cheers, Fabian
2015-03-03 14:08 GMT+01:00 Sebastian <ssc.o...@googlemail.com>:
Hi, I'm getting strange output paths for this piece of code:
computeDistribution(
"/home/ssc/Desktop/__
The directory name is repeated twice in the final
output:
/home/ssc/Desktop/trackthetrackers/out/trackerDistribution/trackerDistribution/
How does this happen?
Best,
Sebastian
Unfortunately, I don't have a build; I'm using the Maven dependency. I'll
try to find a workaround. Thanks for your help.
-s
On 20.02.2015 12:44, Robert Metzger wrote:
Hey Sebastian,
I've fixed the issue in this branch:
https://github.com/rmetzger/flink/tree/flink15
emory (taskmanager.memory.size) in the flink-config.yaml [1].
Cheers, Fabian
[1] http://flink.apache.org/docs/0.8/config.html
On 20 Feb 2015, at 11:30, Sebastian wrote:
Hi,
I get a strange out of memory error from the serialization code when I try to
run the following program:
def compute(trackingGrap
Hi,
I get a strange out of memory error from the serialization code when I
try to run the following program:
def compute(trackingGraphFile: String, domainIndexFile: String,
outputPath: String) = {
implicit val env = ExecutionEnvironment.getExecutionEnvironment
val edges = GraphUtils.readEd
ution.RuntimeEnvironment.run(RuntimeEnvironment.java:257)
at java.lang.Thread.run(Thread.java:745)
I run the job locally, giving 2 GB of RAM to the VM. The code will
produce less than 10 groups and the bitsets used internally should not
be larger than a few megabytes.
Any tips on how to fix this?
;))
val lines =
  env.createHadoopInput(hadoopInput, classOf[LongWritable], classOf[Text], job)
lines.print
env.execute("Scala WordCount Example")
}
On Thu, Feb 19, 2015 at 9:56 PM, Sebastian <ssc.o...@googlemail.com> wrote:
I tried to follow the ex
MethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Any tips on how to proceed?
Be
Hi,
Does Flink support reading gzipped files? I haven't found any info about
this on the website.
Best,
Sebastian
Hi everyone,
I think that during one of the meetups it was mentioned that Flink can, in
some cases, operate on serialized data. Given that I understood correctly,
which cases would that be, i.e., which data types and operators support
such a feature?
Cheers,
Sebastian
---
Sebastian Kruse
In general, all-pairs-shortest-paths is a non-scalable problem as it
produces output proportional to the square of the number of vertices in
a network.
--sebastian
On 15.02.2015 12:37, Vasiliki Kalavri wrote:
Hi,
you can certainly use a for-loop like this to run SSSP several times.
Just
:40 AM, Kruse, Sebastian
<sebastian.kr...@hpi.de> wrote:
Thanks for your answers.
I am trying to build an apriori-like algorithm to find key candidates in a
relational dataset. I was considering delta iterations, because the algorithm
should maintain two datasets: a set of
nge in the future?
Cheers,
Sebastian
From: ewenstep...@gmail.com [mailto:ewenstep...@gmail.com] On Behalf Of Stephan
Ewen
Sent: Wednesday, 11 February 2015 10:02
To: user@flink.apache.org
Subject: Re: DeltaIterations: shrink solution set
You can also use a bulk iteration and just keep the stat
?
My use case at hand for this is the following: In each iteration, I generate
candidate solutions that I want to verify within the next iteration. If
verification fails, I would like to remove them from the solution set,
otherwise retain them.
Thanks,
Sebastian
everywhere?
On Tue, Feb 10, 2015 at 12:09 PM, Sebastian <ssc.o...@googlemail.com> wrote:
I'm running it from within my IDE...
On 10.02.2015 11:17, Robert Metzger wrote:
Hi,
the NoSuchMethodError exception indicates that there is a mixup
Flink version?
On Tue, Feb 10, 2015 at 10:51 AM, Sebastian <ssc.o...@googlemail.com> wrote:
Yes, the import was missing. Thank you for the hint!
Now I'm getting the following error:
java.io.IOException: java.lang.NoSuchMethodError:
org.apache.flink.u
Yes, the import was missing. Thank you for the hint!
Now I'm getting the following error:
java.io.IOException: java.lang.NoSuchMethodError:
org.apache.flink.util.ClassUtils.isPrimitiveOrBoxedOrString(Ljava/lang/Class;)Z
at org.apache.flink.runtime.ipc.RPC$Server.call(RPC.java:428)
Hi,
I'm trying to write a simple join in flink-scala, but for some reason
Flink fails to compile my code. I've tried several reformulations but
can't get rid of the error. Can you tell me how to fix this piece of
code? I'm using flin