Correction: I'm using Scala case classes, not strictly Java POJOs, just to be
clear.
On Fri, Oct 30, 2020 at 7:56 PM Rex Fenley wrote:
> Hello,
>
> I keep running into trouble moving between DataStream and SQL with POJOs
> because my nested POJOs turn into LEGACY('STRUCTURED_TYPE', is there any
>
Hello,
I keep running into trouble moving between DataStream and SQL with POJOs,
because my nested POJOs turn into LEGACY('STRUCTURED_TYPE', ...). Is there any
way to convert them back to POJOs in Flink when converting a SQL Table back
to a DataStream?
Thanks!
--
Rex Fenley | Software Engineer - Mo
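A minimal Scala sketch of the round trip in question (Flink 1.11-era bridge API; the case class and table names below are made up for illustration):

import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.bridge.scala._

object CaseClassRoundTrip {
  case class Inner(a: Int, b: String)
  case class Outer(id: Long, inner: Inner)

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val tEnv = StreamTableEnvironment.create(env)

    val stream: DataStream[Outer] = env.fromElements(Outer(1L, Inner(1, "x")))
    tEnv.createTemporaryView("outer_t", stream)

    // Ask for the case class type explicitly when converting back; the nested
    // field may still show up as LEGACY('STRUCTURED_TYPE', ...) in the Table schema.
    val back: DataStream[Outer] = tEnv.from("outer_t").toAppendStream[Outer]
    back.print()
    env.execute("case-class-round-trip")
  }
}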
Hello,
Thank you for these examples, they look great. However, I can't seem to import
`org.apache.flink.table.planner.runtime.utils.{StreamingTestBase, StringSink}`.
Is it because I'm using the Blink planner and not the regular one?
Thanks
On Fri, Oct 9, 2020 at 7:55 AM Timo Walther wrote:
Another factor to be aware of is that there's no cluster configuration file
to load (the mini-cluster doesn't know where to find flink-conf.yaml), and
by default you won't have a REST API endpoint or web UI available. But you
can do the configuration in the code, and/or provide configuration on the
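For example, a small sketch of the in-code configuration route (assumes the flink-runtime-web dependency is on the test classpath for the UI; the port is just an example):

import org.apache.flink.configuration.{Configuration, RestOptions}
import org.apache.flink.streaming.api.scala._

object LocalEnvWithWebUI {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    conf.setInteger(RestOptions.PORT, 8081) // expose the REST endpoint / web UI locally
    val env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf)
    env.fromElements(1, 2, 3).print()
    env.execute("local-with-web-ui")
  }
}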
It seems like another option here would be to occasionally use the state
processor API to purge a savepoint of all unnecessary state.
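For reference, a rough sketch of what that could look like with the State Processor API (1.11-era flink-state-processor-api; the savepoint paths and operator uid are illustrative, and the state backend must match the one the savepoint was written with):

import org.apache.flink.api.java.ExecutionEnvironment
import org.apache.flink.runtime.state.memory.MemoryStateBackend
import org.apache.flink.state.api.Savepoint

object PruneSavepoint {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    Savepoint
      .load(env, "s3://bucket/savepoints/savepoint-123", new MemoryStateBackend())
      .removeOperator("obsolete-operator-uid") // drop the state nobody needs anymore
      .write("s3://bucket/savepoints/savepoint-123-pruned")
    env.execute("prune-savepoint")
  }
}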
On Fri, Oct 30, 2020 at 6:57 PM Steven Wu wrote:
> not a solution, but a potential workaround. Maybe rename the operator uid
> so that you can continue to leverag
Not a solution, but a potential workaround: maybe rename the operator uid
so that you can continue to leverage allowNonRestoredState?
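A minimal sketch of that workaround (operator uids and names are illustrative): give the stateful operator a new uid, then restore the savepoint with --allowNonRestoredState so the old uid's state is silently dropped.

import org.apache.flink.streaming.api.scala._

object RenameUidExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env
      .fromElements(1, 2, 3)
      .map(_ * 2)
      .uid("doubler-v2") // was "doubler-v1"; the old uid's state is no longer restored
      .print()
    // restore with: flink run -s <savepointPath> --allowNonRestoredState <job.jar>
    env.execute("rename-uid-example")
  }
}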
On Thu, Oct 29, 2020 at 7:58 AM Peter Westermann
wrote:
> Does that actually allow removing a state completely (vs. just modifying
> the values stored in state)?
Hello,
I see that in my classpath (below) I have both log4j-1 and log4j-api-2. Is
this why I'm not seeing any logs? If so, could someone suggest
how to fix it?
export
CLASSPATH=":lib/flink-csv-1.11.0.jar:lib/flink-json-1.11.0.jar:lib/flink-shaded-zookeeper-3.4.14.jar:lib/flink-table-
No, you would do:
CREATE TABLE table1 (
  `text` VARCHAR, -- each CSV row is just a single text column
  `path` VARCHAR METADATA VIRTUAL, -- where path is a name declared by the filesystem source
  `size` VARCHAR METADATA VIRTUAL -- where size is a name declared by the filesystem source
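A hypothetical, self-contained sketch of how that might be used once the feature exists (Scala, embedding a DDL like the one above; the METADATA support in the filesystem connector and the `path` key name are assumptions from this thread, not a released API):

import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}

object FilePathMetadata {
  def main(args: Array[String]): Unit = {
    val settings = EnvironmentSettings.newInstance().inStreamingMode().build()
    val tEnv = TableEnvironment.create(settings)
    // Hypothetical: assumes the filesystem connector exposes `path` as metadata (FLIP-107).
    tEnv.executeSql(
      """CREATE TABLE table1 (
        |  `text` VARCHAR,
        |  `path` VARCHAR METADATA VIRTUAL
        |) WITH (
        |  'connector' = 'filesystem',
        |  'path' = 'file:///data/input',
        |  'format' = 'csv'
        |)""".stripMargin)
    tEnv.executeSql("SELECT `path`, `text` FROM table1").print()
  }
}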
I've written [FLINK-19903][1].
I just read [FLIP-107][2] and [FLINK-15869][3] and I need to ask:
assuming that FLIP-107 / FLINK-15869 is implemented and the Filesystem SQL
connector is modified to expose metadata (including the path, and possibly other
stuff), then to use it I would need to write:
Sure, I’ll write the JIRA issue
On Fri, 30 Oct 2020 at 13:27, Dawid Wysakowicz
wrote:
> I am afraid there is no such functionality available yet.
>
> I think though it is a valid request. I think we can use the upcoming
> FLIP-107 metadata columns for this purpose and expose the file name as
> m
I am afraid there is no such functionality available yet.
I think though it is a valid request. I think we can use the upcoming
FLIP-107 metadata columns for this purpose and expose the file name as
metadata column of a filesystem source.
Would you like to create a JIRA issue for it?
Best,
Dawid
I've asked this already on [stackoverflow][1].
Is there anything equivalent to Spark's `f.input_file_name()`? I
don't see anything that could be used in [system functions][2].
I have a dataset where they embedded some information in the filenames
(200k files) and I need to extract that as a new column
You are right, Chesnay, but I'm doing this stuff in parallel with two other
things and I messed up the jar name, sorry for that.
For the code after env.execute() I'll try to use the new JobListener
interface in the next days. I hope it will be sufficient (I just have to call an
external service to update the
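A small sketch of the JobListener route (the callback bodies are placeholders; registration goes through getJavaEnv since the listener API lives on the Java environment):

import org.apache.flink.api.common.JobExecutionResult
import org.apache.flink.core.execution.{JobClient, JobListener}
import org.apache.flink.streaming.api.scala._

object JobListenerExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.getJavaEnv.registerJobListener(new JobListener {
      override def onJobSubmitted(client: JobClient, t: Throwable): Unit = {
        if (t == null) println(s"submitted ${client.getJobID}")
      }
      override def onJobExecuted(result: JobExecutionResult, t: Throwable): Unit = {
        // this is where a call to an external service could go
        if (t == null) println(s"finished ${result.getJobID}")
      }
    })
    env.fromElements(1, 2, 3).print()
    env.execute("job-listener-example")
  }
}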
Yes, it is definitely way easier to upload & run jars instead of
submitting JobGraphs.
But I thought this was not viable for you because you cannot execute
anything after env.execute()? I believe this limitation still exists. Or
are you referring here to error handling in case env.execute() throws?
I just discovered that I was using the "slim" jar instead of the "fat"
one... sorry. Now I'm able to successfully run the program on the remote
cluster.
However, the fact of generating the job graph on the client side is
something I really don't like at all, because it requires access both to
flink
Very nice Dawid, thanks! I'll give this a try next time I run the tests.
Regards,
Juha
On Fri, Oct 30, 2020 at 13:13, Dawid Wysakowicz ()
wrote:
> Small correction to my previous email. The previously mentioned problem
> is actually not a problem. You can just pass the log4j.configuration
Small correction to my previous email. The previously mentioned problem
is actually not a problem. You can just pass the log4j.configurationFile
explicitly:
mvn '-Dlog4j.configurationFile=[path]/log4j2-on.properties' clean install
Best,
Dawid
On 23/10/2020 09:48, Juha Mynttinen wrote:
> Hey the
You should be able to globally override the configuration file used by
surefire plugin which executes tests like this:
mvn '-Dlog4j.configuration=[path]/log4j2-on.properties' clean install
Bear in mind there is a minor bug in our surefire configuration now:
https://issues.apache.org/jira/browse/F
Can you give me more information on your packaging setup / project
structure? Is "it.test.MyOb" a test class? Does the dependency
containing this class have a "test" scope?
On 10/30/2020 11:34 AM, Chesnay Schepler wrote:
It is irrelevant whether it contains transitive dependencies or not;
that
It is irrelevant whether it contains transitive dependencies or not;
that's a maven concept, not a classloading one.
The WordCount main class, which is only contained in that jar, could be
found, so the classloading is working. If any other class that is
supposed to be in the jar cannot be found,
Hi Dawid, thanks for your reply and sorry for a question with a double
interpretation!
You understood me correctly, I would like to get counter values by their
names after completing all operations with the harness component.
I suppose that it should be useful because most of our functions are
impl
Yes, with the WordCount it works but that jar is not a "fat" jar (it does
not include transitive dependencies).
Actually I was able to use the REST API without creating the JobGraph; you
just have to tell the run API the jar id, the main class to run, and the
optional parameters.
For this don't use
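For reference, a rough sketch of that kind of REST call (host, jar id, and entry class below are illustrative; the jar id comes from a previous POST to /jars/upload):

import java.net.{HttpURLConnection, URL}
import scala.io.Source

object RunJarViaRest {
  def main(args: Array[String]): Unit = {
    val jarId = "b8e7c5a1-1234-5678-9abc-def012345678_myjob.jar" // returned by /jars/upload
    val url = new URL(
      s"http://localhost:8081/jars/$jarId/run?entry-class=com.example.MyJob&parallelism=2")
    val conn = url.openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("POST")
    conn.setDoOutput(true)
    conn.getOutputStream.close() // empty body; program arguments could also be passed as JSON
    val response = Source.fromInputStream(conn.getInputStream).mkString
    println(response) // the response JSON contains the new job id
  }
}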
If you aren't setting up the classpath correctly then you cannot create
a JobGraph, and cannot use the REST API (outside of uploading jars).
In other words, you _will_ have to solve this issue, one way or another.
FYI, I just tried your code to submit a WordCount jar to a cluster (the
one in th
Hi Rinat,
First of all, sorry for some nitpicking in the beginning, but your
message might be a bit misleading for some. If I understood your message
correctly you are referring to Metrics, not accumulators, which are a
different concept[1]. Or were you indeed referring to accumulators?
Now on to
For "REST only client" I mean using only the REST API to interact with the
Flink cluster, i.e. without creating any PackagedProgram and thus without
incurring classpath problems.
I've implemented a running job server that was using the REST API to upload
the job jar and execute the run command but the
To clarify, if the job creation fails on the JM side, in 1.11 the job
submission will fail, in 1.12 it will succeed but the job will be in a
failed state.
On 10/30/2020 10:23 AM, Chesnay Schepler wrote:
1) the job still reaches a failed state, which you can poll for, see 2)
2) polling is your
1) the job still reaches a failed state, which you can poll for, see 2)
2) polling is your only way.
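For example, polling can be as simple as hitting /jobs/<jobid> and checking the "state" field in the response (host and job id are illustrative):

import scala.io.Source

object PollJobState {
  def main(args: Array[String]): Unit = {
    val jobId = "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4"
    // The response JSON contains a "state" field, e.g. RUNNING, FAILED, FINISHED.
    val body = Source.fromURL(s"http://localhost:8081/jobs/$jobId").mkString
    println(body)
  }
}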
What do you mean by "REST only client"? Do you mean a plain HTTP
client, not something that Flink provides?
On 10/30/2020 10:02 AM, Flavio Pompermaier wrote:
No luck with IntelliJ either.
Hello,
I am wondering whether Flink operators synchronize their execution
states like Apache Spark. In Apache Spark, the master decides
everything, for example, it schedules jobs and assigns tasks to
Executors so that each job is executed in a synchronized way. But Flink
looks different. It a
No luck with IntelliJ either... do you have any sample project I
can reuse to test the job submission to a cluster?
I can't really understand why the classes within the fat jar are not found
when generating the PackagedProgram.
Ideally, I'd prefer to use a REST-only client (so no need to build pack
Hi,
The problem in your case is that you exit before anything is printed
out. The method executeInsert executes the query, but it does not wait
for the query to finish. Therefore your main/test method returns,
bringing down the local cluster, before anything is printed out. You can
e.g. add
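One way to do that (a sketch against the 1.11 API; the datagen/print tables are just for illustration) is to block on the job client returned by the TableResult:

import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}

object WaitForInsert {
  def main(args: Array[String]): Unit = {
    val tEnv = TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build())
    tEnv.executeSql("CREATE TABLE src (x INT) WITH ('connector' = 'datagen', 'number-of-rows' = '10')")
    tEnv.executeSql("CREATE TABLE snk (x INT) WITH ('connector' = 'print')")
    val result = tEnv.from("src").executeInsert("snk")
    // Block until the insert job finishes so the local cluster is not torn down early.
    result.getJobClient.ifPresent(c => c.getJobExecutionResult(getClass.getClassLoader).get())
  }
}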
Hi Sunitha,
you probably need to apply a non-windowed grouping.
datastream
    .keyBy(Event::getVehicleId)
    .reduce((first, other) -> first);
This example will always throw away the second record. You may want to
combine the records though by summing up the fuel.
Best,
Arvid
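In Scala, that combining variant might look roughly like this (the Event type and field names are made up to match the discussion):

import org.apache.flink.streaming.api.scala._

object SumFuelPerVehicle {
  case class Event(vehicleId: String, fuel: Double)

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env
      .fromElements(Event("v1", 10.0), Event("v1", 5.0), Event("v2", 7.0))
      .keyBy(_.vehicleId)
      // keep the first record but accumulate the fuel of every later one
      .reduce((first, other) => first.copy(fuel = first.fuel + other.fuel))
      .print()
    env.execute("sum-fuel-per-vehicle")
  }
}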
On Wed, Oct 28,
Hi Thilo,
the number of required network buffers depends on your data exchanges and
parallelism.
For each shuffling data exchange (what you need for group by), you ideally
have #slots-per-TM^2 * #TMs * 4 buffers.
So I'm assuming you have 11 machines and 8 slots per machine. Then for best
performa
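To make that rule of thumb concrete with those numbers: 8 slots per TM squared is 64, times 11 TMs, times 4, gives 2,816 network buffers; with the default 32 KiB buffer size that is roughly 88 MiB of network memory.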
Hi Schneider,
The error message suggests that your task managers are not configured with
enough network memory. You would need to increase the network memory
configuration. See this doc [1] for more details.
Thank you~
Xintong Song
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.
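For example, the relevant flink-conf.yaml knobs look roughly like this (values are illustrative, not recommendations):

taskmanager.memory.network.fraction: 0.2
taskmanager.memory.network.min: 128mb
taskmanager.memory.network.max: 1gb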