[jira] [Created] (FLINK-4668) Fix positive random int generation
Alexander Pivovarov created FLINK-4668: -- Summary: Fix positive random int generation Key: FLINK-4668 URL: https://issues.apache.org/jira/browse/FLINK-4668 Project: Flink Issue Type: Bug Components: Client Reporter: Alexander Pivovarov Priority: Trivial According to the Java spec, {code}Math.abs(Integer.MIN_VALUE) == Integer.MIN_VALUE{code} So {code}Math.abs(rnd.nextInt()){code} might return a negative value. To generate a positive random int value we can use {code}rnd.nextInt(Integer.MAX_VALUE){code} (note that Integer.MAX_VALUE itself will be excluded). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
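A quick stand-alone demonstration of the overflow described in this issue (plain JDK, not the Flink code itself):

```java
import java.util.Random;

public class AbsOverflowDemo {
    public static void main(String[] args) {
        // -Integer.MIN_VALUE is not representable as an int, so Math.abs
        // returns the input unchanged, i.e. a negative number.
        System.out.println(Math.abs(Integer.MIN_VALUE) == Integer.MIN_VALUE); // true

        // Math.abs(rnd.nextInt()) is therefore occasionally negative,
        // while nextInt(bound) is guaranteed to be in [0, bound).
        Random rnd = new Random();
        for (int i = 0; i < 1_000_000; i++) {
            int safe = rnd.nextInt(Integer.MAX_VALUE);
            if (safe < 0) throw new AssertionError("nextInt(bound) returned a negative value");
        }
        System.out.println("nextInt(Integer.MAX_VALUE) stayed non-negative");
    }
}
```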
[jira] [Created] (FLINK-4667) Yarn Session CLI not listening on correct ZK namespace when HA is enabled to use ZooKeeper backend
Vijay Srinivasaraghavan created FLINK-4667: -- Summary: Yarn Session CLI not listening on correct ZK namespace when HA is enabled to use ZooKeeper backend Key: FLINK-4667 URL: https://issues.apache.org/jira/browse/FLINK-4667 Project: Flink Issue Type: Bug Components: YARN Client Reporter: Vijay Srinivasaraghavan Assignee: Vijay Srinivasaraghavan Priority: Minor In Yarn mode, when Flink is configured for HA with the ZooKeeper backend, the leader election listener does not provide the correct JM/leader info and will time out, since the listener waits on the default ZK namespace instead of the application-specific one (the Application ID). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4666) Make constants final in ParameterTool
Alexander Pivovarov created FLINK-4666: -- Summary: Make constants final in ParameterTool Key: FLINK-4666 URL: https://issues.apache.org/jira/browse/FLINK-4666 Project: Flink Issue Type: Bug Components: Java API Reporter: Alexander Pivovarov Priority: Trivial NO_VALUE_KEY and DEFAULT_UNDEFINED in ParameterTool should be final -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4665) Remove boxing/unboxing to parse a primitive
Alexander Pivovarov created FLINK-4665: -- Summary: Remove boxing/unboxing to parse a primitive Key: FLINK-4665 URL: https://issues.apache.org/jira/browse/FLINK-4665 Project: Flink Issue Type: Bug Components: Java API Reporter: Alexander Pivovarov Priority: Trivial I found the following issues with boxing/unboxing and Integer: 1. The current code does boxing/unboxing to parse a primitive - it is more efficient to just call the static parseXXX method. 2. Boxing/unboxing is used to do a type cast. 3. new Integer is used instead of valueOf - new Integer(int) is guaranteed to always create a new object, whereas Integer.valueOf(int) allows values to be cached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
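For reference, a small stand-alone sketch of points 1 and 3 (plain JDK; the values are illustrative, this is not the Flink code in question):

```java
public class BoxingDemo {
    public static void main(String[] args) {
        // 1. Parsing straight to a primitive avoids an intermediate boxed Integer:
        int a = Integer.parseInt("42");  // preferred: returns the primitive directly
        int b = Integer.valueOf("42");   // boxes to an Integer, then auto-unboxes

        // 3. valueOf may return a cached instance (values in [-128, 127] are
        // cached by default), while new Integer(int) always allocates.
        System.out.println(Integer.valueOf(100) == Integer.valueOf(100));   // true (same cached object)
        System.out.println(Integer.valueOf(1000) == Integer.valueOf(1000)); // false (outside the cache)
    }
}
```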
Re: Flink Scala - data load from CSV to postgress database.
Hi Guys, We have requirement like – loading data from local CSV file to Postgress database using Flink Scala… Do you have any sample Flink scala code for this? We have tried and searched in Google/Flinkweb website for data load, we haven’t found any sample code for this requisite. Code: Flink Scala. Load: from CSV to local postgres database. Thanks Jagan On 22 September 2016 at 20:37, Jagan wrote: > Hi Team, > > Will you be able to guide me on this? Is this a known issue that we > can't implement dataload in flink scala ? > >data load from csv to postgress or any relational database in Flink > Scala > > Thanks > > Jagan. > > On 22 September 2016 at 20:15, Jagan wrote: > >> Thanks Suneel, >> >> but client want to implement the data load in Flink Scala.. >> >> >> On 22 September 2016 at 20:07, Suneel Marthi wrote: >> >>> Couldn't u use SQLLoader or something for doing that? >>> >>> http://stackoverflow.com/questions/2987433/how-to-import-csv >>> -file-data-into-a-postgresql-table >>> >>> >>> >>> On Thu, Sep 22, 2016 at 3:01 PM, Jagan wrote: >>> >>> > Hi Guys, >>> > >>> > We have a requirement like – loading data from local CSV file to >>> Postgress >>> > database using Flink Scala…We have tried number of ways all failed >>> > >>> > Do you have any example for this? With dependency libraries to >>> understand >>> > how to load data from CSV to postgres >>> > >>> > We have tried and searched in Google/Flinkweb website for data load, we >>> > haven’t found any sample code for this requisite. >>> > >>> > Code: Flink Scala. >>> > >>> > Load: from CSV to local postgres database. >>> > >>> > Thanks >>> > >>> > Jagan >>> > >>> > 0044-7411239688 >>> > >>> > >>> > >>> > -- >>> > Regards. >>> > Jagan. >>> > >>> > >>> > - >>> > * The Good You Do Today, People will often forget tomorrow: Do >>> Good >>> > Anyway >>> > >>> >> >> >> >> -- >> Regards. >> Jagan. >> >> >> - >> * The Good You Do Today, People will often forget tomorrow: Do Good >> Anyway >> > > > > -- > Regards. 
> Jagan. > > > - > * The Good You Do Today, People will often forget tomorrow: Do Good > Anyway > -- Regards. Jagan. - * The Good You Do Today, People will often forget tomorrow: Do Good Anyway
Flink Accumulators vs Metrics
Hi All Based on my code reading, I have the following understanding of Metrics and Accumulators. 1. Accumulators for a Flink job work like global counters. They are designed so that accumulator values from different instances of an Execution Vertex can be combined; they are essentially distributed counters. 2. Flink Metrics are local to the Task Manager that reports them, and need external aggregation for a job-centric view. I see that one can define user metrics as part of writing Flink programs, but these metrics are not consolidated when the job runs the same task on different task managers. Having said that, is it fair to say that Metrics are for surfacing operational details only, and will not be replacing Accumulators any time soon? For my use case, I want to maintain some global counters/histograms (like the ones available in Storm - e.g. total messages processed in the last 1 minute, last 10 minutes, etc.). Metrics would have been a perfect fit for these, but one would need to employ external aggregation to come up with a holistic view of metrics at the job level. Please correct my understanding if I am missing something here. Regards Sumit Chawla
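To illustrate the distinction with a plain-Java sketch (this is not Flink API; the counter name and per-subtask snapshots are made up): it mimics the merge that the JobManager performs for accumulator values from parallel instances, which is exactly the aggregation a user would have to do externally for task-local metrics.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AccumulatorMergeDemo {
    public static void main(String[] args) {
        // Hypothetical counter snapshots reported by two parallel subtasks.
        Map<String, Long> subtask0 = Map.of("messagesProcessed", 120L);
        Map<String, Long> subtask1 = Map.of("messagesProcessed", 95L);

        // Accumulators get combined into one job-level value like this;
        // plain metrics stay per-TaskManager unless aggregated externally.
        Map<String, Long> jobView = new HashMap<>();
        for (Map<String, Long> snapshot : List.of(subtask0, subtask1)) {
            snapshot.forEach((name, value) -> jobView.merge(name, value, Long::sum));
        }
        System.out.println(jobView.get("messagesProcessed")); // 215
    }
}
```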
Re: Get Flink ExecutionGraph Programmatically
HI Aljoscha I was able to get the ClusterClient and Accumulators using following: DefaultCLI defaultCLI = new DefaultCLI(); CommandLine line = new DefaultParser().parse(new Options(), new String[]{}, true); ClusterClient clusterClient = defaultCLI.retrieveCluster(line,configuration); Regards Sumit Chawla On Thu, Sep 22, 2016 at 4:55 AM, Aljoscha Krettek wrote: > Hi, > there is ClusterClient.getAccumulators(JobID jobID) which should be able > to > get the accumulators for a running job. If you can construct a > ClusterClient that should be a good solution. > > Cheers, > Aljoscha > > On Wed, 21 Sep 2016 at 21:15 Chawla,Sumit wrote: > > > Hi Sean > > > > My goal here is to get User Accumulators. I know there exists the REST > > Calls. But since i am running my code in the same JVM, i wanted to avoid > > go over HTTP. I saw this code in JobAccumulatorsHandler and tried to use > > this. Would you suggest some alternative approach to avoid this over the > > network serialization for Akka? > > > > Regards > > Sumit Chawla > > > > > > On Wed, Sep 21, 2016 at 11:37 AM, Stephan Ewen wrote: > > > > > Between two different actor systems in the same JVM, messages are still > > > serialized (they go through a local socket, I think). > > > > > > Getting the execution graph is not easily possible, and not intended, > as > > it > > > actually contains RPC resources, etc. > > > > > > What do you need from the execution graph? Maybe there is another way > to > > > achieve that... > > > > > > On Wed, Sep 21, 2016 at 8:08 PM, Chawla,Sumit > > > wrote: > > > > > > > Hi Chesney > > > > > > > > I am actually running this code in the same JVM as the WebInterface > and > > > > JobManager. I am programmatically, starting the JobManager. and > then > > > > running this code in same JVM to query metrics. Only difference > could > > be > > > > that i am creating a new Akka ActorSystem, and ActorGateway. 
Not sure > > if > > > it > > > > forces it to execute the code as if request is coming over the > wire. I > > > am > > > > not very well aware of Akka internals, so may be somebody can shed > some > > > > light on it. > > > > > > > > Regards > > > > Sumit Chawla > > > > > > > > > > > > On Wed, Sep 21, 2016 at 1:06 AM, Chesnay Schepler < > ches...@apache.org> > > > > wrote: > > > > > > > > > Hello, > > > > > > > > > > this is a rather subtle issue you stumbled upon here. > > > > > > > > > > The ExecutionGraph is not serializable. The only reason why the > > > > > WebInterface can access it is because it runs in the same JVM as > the > > > > > JobManager. > > > > > > > > > > I'm not sure if there is a way for what you are trying to do. > > > > > > > > > > Regards, > > > > > Chesnay > > > > > > > > > > > > > > > On 21.09.2016 06:11, Chawla,Sumit wrote: > > > > > > > > > >> Hi All > > > > >> > > > > >> > > > > >> I am trying to get JOB accumulators. ( I am aware that I can get > > the > > > > >> accumulators through REST APIs as well, but i wanted to avoid JSON > > > > >> parsing). > > > > >> > > > > >> Looking at JobAccumulatorsHandler i am trying to get execution > graph > > > for > > > > >> currently running job. 
Following is my code: > > > > >> > > > > >>InetSocketAddress initialJobManagerAddress=new > > > > >> InetSocketAddress(hostName,port); > > > > >> InetAddress ownHostname; > > > > >> ownHostname= > > > > >> ConnectionUtils.findConnectingAddress( > initialJobManagerAddress,2000, > > > > 400); > > > > >> > > > > >> ActorSystem actorSystem= > AkkaUtils.createActorSystem(co > > > > >> nfiguration, > > > > >> new Some(new > > > > >> Tuple2(ownHostname.getCanonicalHostName(),0))); > > > > >> > > > > >> FiniteDuration timeout= FiniteDuration.apply(10, > > > > >> TimeUnit.SECONDS); > > > > >> > > > > >> ActorGateway akkaActorGateway= > > > > >> LeaderRetrievalUtils.retrieveLeaderGateway( > > > > >> > > > > >> LeaderRetrievalUtils.createLeaderRetrievalService(configuration), > > > > >> actorSystem,timeout > > > > >> ); > > > > >> > > > > >> > > > > >> Future future=akkaActorGateway.ask(new > > > > >> RequestJobDetails(true,false),timeout); > > > > >> > > > > >> MultipleJobsDetails result=(MultipleJobsDetails) > > > > >> Await.result(future,timeout); > > > > >> ExecutionGraphHolder executionGraphHolder=new > > > > >> ExecutionGraphHolder(timeout); > > > > >> LOG.info(result.toString()); > > > > >> for(JobDetails detail:result.getRunningJobs()){ > > > > >> LOG.info(detail.getJobName() + " ID " + > > > > >> detail.getJobId()); > > > > >> > > > > >> *ExecutionGraph > > > > >> executionGraph=executionGraphHolder.getExecutionGraph(detail. > > > > getJobId(), > > > > >> akkaActorGateway);* > > > > >> > > > > >> LOG.info("Accumulators " + > > > > >> executionGraph.aggregateUserAccumulators(
[jira] [Created] (FLINK-4664) Add translator to NullValue
Greg Hogan created FLINK-4664: - Summary: Add translator to NullValue Key: FLINK-4664 URL: https://issues.apache.org/jira/browse/FLINK-4664 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 1.2.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Minor Fix For: 1.2.0 Existing translators convert from LongValue (the output label type of graph generators) to IntValue, StringValue, and an offset LongValue. Translators can also be used to convert vertex or edge values. This translator will be appropriate for translating these vertex or edge values to NullValue when the values are not used in an algorithm. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
FailureRate Restart Strategy is not picked from Config file
Hi All, I tried to use the FailureRate restart strategy by setting values for it in flink-conf.yaml, but Flink (v1.1.2) did not pick it up. # Flink Restart strategy restart-strategy: failure-rate restart-strategy.failure-rate.delay: 120 s restart-strategy.failure-rate.failure-rate-interval: 12 minute restart-strategy.failure-rate.max-failures-per-interval: 300 It works when I set it up explicitly in the topology using *env.setRestartStrategy*. PFA a snapshot of the JobManager log. Thanks, Deepak Jha
Flink Scala data load from CSV to postgress database.
Hi Guys, We have a requirement like – loading data from a local CSV file to a Postgres database using Flink Scala. We have tried a number of ways; all failed. Do you have any example for this, with dependency libraries, to understand how to load data from CSV to Postgres? We have tried and searched on Google and the Flink website for data load; we haven't found any sample code for this requirement. Code: Flink Scala. Load: from CSV to local Postgres database. Thanks Jagan 0044-7411239688
Re: Flink Scala - data load from CSV to postgress database.
Hi Team, Will you be able to guide me on this? Is this a known issue that we can't implement dataload in flink scala ? data load from csv to postgress or any relational database in Flink Scala Thanks Jagan. On 22 September 2016 at 20:15, Jagan wrote: > Thanks Suneel, > > but client want to implement the data load in Flink Scala.. > > > On 22 September 2016 at 20:07, Suneel Marthi wrote: > >> Couldn't u use SQLLoader or something for doing that? >> >> http://stackoverflow.com/questions/2987433/how-to-import- >> csv-file-data-into-a-postgresql-table >> >> >> >> On Thu, Sep 22, 2016 at 3:01 PM, Jagan wrote: >> >> > Hi Guys, >> > >> > We have a requirement like – loading data from local CSV file to >> Postgress >> > database using Flink Scala…We have tried number of ways all failed >> > >> > Do you have any example for this? With dependency libraries to >> understand >> > how to load data from CSV to postgres >> > >> > We have tried and searched in Google/Flinkweb website for data load, we >> > haven’t found any sample code for this requisite. >> > >> > Code: Flink Scala. >> > >> > Load: from CSV to local postgres database. >> > >> > Thanks >> > >> > Jagan >> > >> > 0044-7411239688 >> > >> > >> > >> > -- >> > Regards. >> > Jagan. >> > >> > >> > - >> > * The Good You Do Today, People will often forget tomorrow: Do Good >> > Anyway >> > >> > > > > -- > Regards. > Jagan. > > > - > * The Good You Do Today, People will often forget tomorrow: Do Good > Anyway > -- Regards. Jagan. - * The Good You Do Today, People will often forget tomorrow: Do Good Anyway
Re: Flink Scala - data load from CSV to postgress database.
Thanks Suneel, but client want to implement the data load in Flink Scala.. On 22 September 2016 at 20:07, Suneel Marthi wrote: > Couldn't u use SQLLoader or something for doing that? > > http://stackoverflow.com/questions/2987433/how-to- > import-csv-file-data-into-a-postgresql-table > > > > On Thu, Sep 22, 2016 at 3:01 PM, Jagan wrote: > > > Hi Guys, > > > > We have a requirement like – loading data from local CSV file to > Postgress > > database using Flink Scala…We have tried number of ways all failed > > > > Do you have any example for this? With dependency libraries to understand > > how to load data from CSV to postgres > > > > We have tried and searched in Google/Flinkweb website for data load, we > > haven’t found any sample code for this requisite. > > > > Code: Flink Scala. > > > > Load: from CSV to local postgres database. > > > > Thanks > > > > Jagan > > > > 0044-7411239688 > > > > > > > > -- > > Regards. > > Jagan. > > > > > > - > > * The Good You Do Today, People will often forget tomorrow: Do Good > > Anyway > > > -- Regards. Jagan. - * The Good You Do Today, People will often forget tomorrow: Do Good Anyway
Re: Flink Scala - data load from CSV to postgress database.
Couldn't u use SQLLoader or something for doing that? http://stackoverflow.com/questions/2987433/how-to-import-csv-file-data-into-a-postgresql-table On Thu, Sep 22, 2016 at 3:01 PM, Jagan wrote: > Hi Guys, > > We have a requirement like – loading data from local CSV file to Postgress > database using Flink Scala…We have tried number of ways all failed > > Do you have any example for this? With dependency libraries to understand > how to load data from CSV to postgres > > We have tried and searched in Google/Flinkweb website for data load, we > haven’t found any sample code for this requisite. > > Code: Flink Scala. > > Load: from CSV to local postgres database. > > Thanks > > Jagan > > 0044-7411239688 > > > > -- > Regards. > Jagan. > > > - > * The Good You Do Today, People will often forget tomorrow: Do Good > Anyway >
Fwd: Flink Scala - data load from CSV to postgress database.
Hi Guys, We have a requirement like – loading data from a local CSV file to a Postgres database using Flink Scala. We have tried a number of ways; all failed. Do you have any example for this, with dependency libraries, to understand how to load data from CSV to Postgres? We have tried and searched on Google and the Flink website for data load; we haven't found any sample code for this requirement. Code: Flink Scala. Load: from CSV to local Postgres database. Thanks Jagan 0044-7411239688 -- Regards. Jagan. - * The Good You Do Today, People will often forget tomorrow: Do Good Anyway
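No official sample appears in this thread; as a rough starting point outside Flink, a plain-JDBC batch load can look like the sketch below. The table name, JDBC URL, and the naive comma-splitting are illustrative assumptions (a real CSV needs a proper parser); Flink's flink-jdbc module provides a JDBCOutputFormat that wraps the same batched-insert idea for use inside a job.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.List;

public class CsvToPostgres {
    // Build a parameterized INSERT for the given table and column count.
    static String insertSql(String table, int columns) {
        StringBuilder sb = new StringBuilder("INSERT INTO " + table + " VALUES (");
        for (int i = 0; i < columns; i++) sb.append(i == 0 ? "?" : ", ?");
        return sb.append(")").toString();
    }

    // Naive split with no quoted-field handling; fine only for simple files.
    static String[] parseLine(String line) {
        return line.split(",", -1);
    }

    // The actual load: plain JDBC with batched inserts. The URL, e.g.
    // "jdbc:postgresql://localhost/mydb?user=...", is a placeholder.
    static void load(Path csv, String jdbcUrl, String table) throws Exception {
        List<String> lines = Files.readAllLines(csv);
        if (lines.isEmpty()) return;
        int columns = parseLine(lines.get(0)).length;
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement(insertSql(table, columns))) {
            for (String line : lines) {
                String[] fields = parseLine(line);
                for (int i = 0; i < fields.length; i++) ps.setString(i + 1, fields[i]);
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }

    public static void main(String[] args) {
        // Demonstrate the statement that would be prepared for a 3-column table.
        System.out.println(insertSql("mytable", 3)); // INSERT INTO mytable VALUES (?, ?, ?)
    }
}
```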
Re: [DISCUSS] Proposal for Asynchronous I/O in FLINK
Not to derail this thread onto another topic but the problem with using a static instance is that there's no way to shut it down when the job stops. So if, for example, it starts threads, I don't think those threads will stop when the job stops. I'm not very well versed in how various Java 8 implementations perform unloading of classloaders & class definitions/statics therein, but it seems problematic unless the job provides a shutdown hook to which user code can subscribe. On 9/21/16, 8:05 PM, "David Wang" wrote: >Hi Shannon, > >That's right. This FLIP aims to boost TPS of the task workers with async >i/o operation. > >As what Stephan has mentioned, by placing static attribute to shared >resources(like event pool, connection), it is possible to share those >resources among different slots in the same JVM. > >I will make a note in the FLIP about how to share resources ;D > >Thanks, >David > >2016-09-22 1:46 GMT+08:00 Stephan Ewen : > >> @Shannon: One could have a "static" broker to share the same netty across >> slots in the same JVM. Implicitly, Flink does the same with broadcast >> variables. >> >> On Wed, Sep 21, 2016 at 5:04 PM, Shannon Carey wrote: >> >> > David, >> > >> > I just wanted to say "thanks" for making this proposal! I'm also >> > interested in performing nonblocking I/O (multiplexing threads/reactive >> > programming) within Flink operators so that we can, for example, >> > communicate with external web services with Netty/RxNetty without >> blocking >> > an entire Flink slot (aka a thread) while we wait for the operation to >> > complete. It looks like your FLIP will enable that use case. >> > >> > I'm not sure whether it will be possible to share one Netty >> EventLoopGroup >> > (or the equivalent for any other non-blocking framework, connection pool, >> > etc.) among multiple slots in a single JVM though. Flink supports >> > open/close operation on a RichFunction, but that's on a per-slot basis. 
I >> > don't know of a way to open/close objects on a per-job-JVM basis. But I >> > suppose that's an issue that should be discussed and resolved separately. >> > >> > -Shannon >> > >> > >>
Re: [DISCUSS] Merge batch and stream connector modules
+1 for Fabian's suggestion On Thu, Sep 22, 2016 at 3:25 PM, Swapnil Chougule wrote: > +1 > It will be good to have one module flink-connectors (union of streaming and > batch connectors). > > Regards, > Swapnil > > On Thu, Sep 22, 2016 at 6:35 PM, Fabian Hueske wrote: > > > Hi everybody, > > > > right now, we have two separate Maven modules for batch and streaming > > connectors (flink-batch-connectors and flink-streaming-connectors) that > > contain modules for the individual external systems and storage formats > > such as HBase, Cassandra, Avro, Elasticsearch, etc. > > > > Some of these systems can be used in streaming as well as batch jobs as > for > > instance HBase, Cassandra, and Elasticsearch. However, due to the > separate > > main modules for streaming and batch connectors, we currently need to > > decide where to put a connector. For example, the > flink-connector-cassandra > > module is located in flink-streaming-connectors but includes a > > CassandraInputFormat and CassandraOutputFormat (i.e., a batch source and > > sink). > > > > In my opinion, it would be better to just merge flink-batch-connectors > and > > flink-streaming-connectors into a joint flink-connectors module. > > > > This would be only an internal restructuring of code and not be visible > to > > users (unless we change the module names of the individual connectors > which > > is not necessary, IMO). > > > > What do others think? > > > > Best, Fabian > > >
Re: [DISCUSS] Merge batch and stream connector modules
+1 It will be good to have one module flink-connectors (union of streaming and batch connectors). Regards, Swapnil On Thu, Sep 22, 2016 at 6:35 PM, Fabian Hueske wrote: > Hi everybody, > > right now, we have two separate Maven modules for batch and streaming > connectors (flink-batch-connectors and flink-streaming-connectors) that > contain modules for the individual external systems and storage formats > such as HBase, Cassandra, Avro, Elasticsearch, etc. > > Some of these systems can be used in streaming as well as batch jobs as for > instance HBase, Cassandra, and Elasticsearch. However, due to the separate > main modules for streaming and batch connectors, we currently need to > decide where to put a connector. For example, the flink-connector-cassandra > module is located in flink-streaming-connectors but includes a > CassandraInputFormat and CassandraOutputFormat (i.e., a batch source and > sink). > > In my opinion, it would be better to just merge flink-batch-connectors and > flink-streaming-connectors into a joint flink-connectors module. > > This would be only an internal restructuring of code and not be visible to > users (unless we change the module names of the individual connectors which > is not necessary, IMO). > > What do others think? > > Best, Fabian >
[DISCUSS] Merge batch and stream connector modules
Hi everybody, right now, we have two separate Maven modules for batch and streaming connectors (flink-batch-connectors and flink-streaming-connectors) that contain modules for the individual external systems and storage formats such as HBase, Cassandra, Avro, Elasticsearch, etc. Some of these systems can be used in streaming as well as batch jobs as for instance HBase, Cassandra, and Elasticsearch. However, due to the separate main modules for streaming and batch connectors, we currently need to decide where to put a connector. For example, the flink-connector-cassandra module is located in flink-streaming-connectors but includes a CassandraInputFormat and CassandraOutputFormat (i.e., a batch source and sink). In my opinion, it would be better to just merge flink-batch-connectors and flink-streaming-connectors into a joint flink-connectors module. This would be only an internal restructuring of code and not be visible to users (unless we change the module names of the individual connectors which is not necessary, IMO). What do others think? Best, Fabian
Re: Get Flink ExecutionGraph Programmatically
Hi, there is ClusterClient.getAccumulators(JobID jobID) which should be able to get the accumulators for a running job. If you can construct a ClusterClient that should be a good solution. Cheers, Aljoscha On Wed, 21 Sep 2016 at 21:15 Chawla,Sumit wrote: > Hi Sean > > My goal here is to get User Accumulators. I know there exists the REST > Calls. But since i am running my code in the same JVM, i wanted to avoid > go over HTTP. I saw this code in JobAccumulatorsHandler and tried to use > this. Would you suggest some alternative approach to avoid this over the > network serialization for Akka? > > Regards > Sumit Chawla > > > On Wed, Sep 21, 2016 at 11:37 AM, Stephan Ewen wrote: > > > Between two different actor systems in the same JVM, messages are still > > serialized (they go through a local socket, I think). > > > > Getting the execution graph is not easily possible, and not intended, as > it > > actually contains RPC resources, etc. > > > > What do you need from the execution graph? Maybe there is another way to > > achieve that... > > > > On Wed, Sep 21, 2016 at 8:08 PM, Chawla,Sumit > > wrote: > > > > > Hi Chesney > > > > > > I am actually running this code in the same JVM as the WebInterface and > > > JobManager. I am programmatically, starting the JobManager. and then > > > running this code in same JVM to query metrics. Only difference could > be > > > that i am creating a new Akka ActorSystem, and ActorGateway. Not sure > if > > it > > > forces it to execute the code as if request is coming over the wire. I > > am > > > not very well aware of Akka internals, so may be somebody can shed some > > > light on it. > > > > > > Regards > > > Sumit Chawla > > > > > > > > > On Wed, Sep 21, 2016 at 1:06 AM, Chesnay Schepler > > > wrote: > > > > > > > Hello, > > > > > > > > this is a rather subtle issue you stumbled upon here. > > > > > > > > The ExecutionGraph is not serializable. 
The only reason why the > > > > WebInterface can access it is because it runs in the same JVM as the > > > > JobManager. > > > > > > > > I'm not sure if there is a way for what you are trying to do. > > > > > > > > Regards, > > > > Chesnay > > > > > > > > > > > > On 21.09.2016 06:11, Chawla,Sumit wrote: > > > > > > > >> Hi All > > > >> > > > >> > > > >> I am trying to get JOB accumulators. ( I am aware that I can get > the > > > >> accumulators through REST APIs as well, but i wanted to avoid JSON > > > >> parsing). > > > >> > > > >> Looking at JobAccumulatorsHandler i am trying to get execution graph > > for > > > >> currently running job. Following is my code: > > > >> > > > >>InetSocketAddress initialJobManagerAddress=new > > > >> InetSocketAddress(hostName,port); > > > >> InetAddress ownHostname; > > > >> ownHostname= > > > >> ConnectionUtils.findConnectingAddress(initialJobManagerAddress,2000, > > > 400); > > > >> > > > >> ActorSystem actorSystem= AkkaUtils.createActorSystem(co > > > >> nfiguration, > > > >> new Some(new > > > >> Tuple2(ownHostname.getCanonicalHostName(),0))); > > > >> > > > >> FiniteDuration timeout= FiniteDuration.apply(10, > > > >> TimeUnit.SECONDS); > > > >> > > > >> ActorGateway akkaActorGateway= > > > >> LeaderRetrievalUtils.retrieveLeaderGateway( > > > >> > > > >> LeaderRetrievalUtils.createLeaderRetrievalService(configuration), > > > >> actorSystem,timeout > > > >> ); > > > >> > > > >> > > > >> Future future=akkaActorGateway.ask(new > > > >> RequestJobDetails(true,false),timeout); > > > >> > > > >> MultipleJobsDetails result=(MultipleJobsDetails) > > > >> Await.result(future,timeout); > > > >> ExecutionGraphHolder executionGraphHolder=new > > > >> ExecutionGraphHolder(timeout); > > > >> LOG.info(result.toString()); > > > >> for(JobDetails detail:result.getRunningJobs()){ > > > >> LOG.info(detail.getJobName() + " ID " + > > > >> detail.getJobId()); > > > >> > > > >> *ExecutionGraph > > > >> 
executionGraph=executionGraphHolder.getExecutionGraph(detail. > > > getJobId(), > > > >> akkaActorGateway);* > > > >> > > > >> LOG.info("Accumulators " + > > > >> executionGraph.aggregateUserAccumulators()); > > > >> } > > > >> > > > >> > > > >> However, i am receiving following error in Flink: > > > >> > > > >> 2016-09-20T14:19:53,201 [flink-akka.actor.default-dispatcher-3] > > nobody > > > >> ERROR akka.remote.EndpointWriter - Transient association error > > > >> (association > > > >> remains live) > > > >> java.io.NotSerializableException: org.apache.flink.runtime. > > checkpoint. > > > >> CheckpointCoordinator > > > >> at java.io.ObjectOutputStream.writeObject0( > > ObjectOutputStream. > > > >> java:1184) > > > >> ~[?:1.8.0_92] > > > >> at java.io.ObjectOutputStream.defaultWriteFields( > > > ObjectOutputSt > > > >> ream.java:1548
[jira] [Created] (FLINK-4663) Flink JDBCOutputFormat logs wrong WARN message
Swapnil Chougule created FLINK-4663: --- Summary: Flink JDBCOutputFormat logs wrong WARN message Key: FLINK-4663 URL: https://issues.apache.org/jira/browse/FLINK-4663 Project: Flink Issue Type: Bug Components: Batch Connectors and Input/Output Formats Affects Versions: 1.1.2, 1.1.1 Environment: Across Platform Reporter: Swapnil Chougule Fix For: 1.1.3 Flink JDBCOutputFormat logs the WARN message "Column SQL types array doesn't match arity of passed Row! Check the passed array..." even when there is no mismatch between the SQL types array and the arity of the passed Row. (flink/flink-batch-connectors/flink-jdbc/src/main/java/org/apache/flink/api/java/io/jdbc/JDBCOutputFormat.java) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
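For illustration only (this is not the actual Flink source), the guard the issue describes should fire the warning only on a genuine mismatch, along these lines:

```java
public class ArityCheckDemo {
    // Hypothetical re-creation of the intended guard: warn only when a types
    // array was configured AND its length differs from the row arity.
    static boolean shouldWarn(int[] sqlTypes, int rowArity) {
        return sqlTypes != null && sqlTypes.length != rowArity;
    }

    public static void main(String[] args) {
        System.out.println(shouldWarn(new int[]{1, 2, 3}, 3)); // false: arity matches, no warning
        System.out.println(shouldWarn(new int[]{1, 2}, 3));    // true: genuine mismatch
        System.out.println(shouldWarn(null, 3));               // false: no types array configured
    }
}
```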
Re: On (FLINK-1526) JIRA issue
Exactly :) That's why we haven't added either the spanning tree or the strongly connected components algorithms yet. On Sep 22, 2016 12:16 PM, "Stephan Ewen" wrote: > Just as a general comment: > > A program with nested loops is most likely not going to be performant in > any way. It makes sense to re-think the algorithm, come up with a modified > or different pattern, rather than trying to implement the exact algorithm > line by line. > > It may be worth checking that, because I am not sure if Gelly should have > algorithms that don't perform well. > > On Thu, Sep 22, 2016 at 11:40 AM, Vasiliki Kalavri < > vasilikikala...@gmail.com> wrote: > > > Hi Olga, > > > > when you use mapEdges() or mapVertices() with generics, Flink cannot > > determine the type because of type erasure, like the exception says. > That's > > why we also provide methods that take the type information as a > parameter. > > You can use those to make the return type explicit. In your example, you > > should do something like the following (line 41): > > > > final TypeInformation longType = BasicTypeInfo.LONG_TYPE_INFO; > > final TypeInformation doubleType = BasicTypeInfo.DOUBLE_TYPE_ INFO; > > Graph> graphOut = > > graph.mapEdges(new InitializeEdges(), new > > TupleTypeInfo(longType, longType, > > new TupleTypeInfo>(doubleType, > > longType,longType))); > > > > Regarding the nested loops, I am almost sure that you will face problems > if > > you try to experiment with large datasets. I haven't looked into your > code > > yet, but according to the JIRA discussion, we've faced this problem > before > > and afaik, this is still an issue. > > > > Cheers, > > -Vasia.
> > > > On 22 September 2016 at 01:12, Olga Golovneva > wrote: > > > > > Hi Vasia, > > > > > > I have uploaded these tests on github: > > > https://github.com/OlgaGolovneva/MST/tree/master/tests > > > > > > I have also uploaded source code, but I'm still working on it: > > > https://github.com/OlgaGolovneva/MST/tree/master/src > > > > > > >I think you cannot add attachments to the mailing list. Could you > > upload > > > >your example somewhere and post a link here? I'm actually surprised > that > > > >the while-loop works without problems. > > > > > > I have run the program on several simple tests, and I was going to try > > > large datasets in the next few days. Please, let me know if this > approach > > > is wrong. > > > > > > Thanks, > > > Olga > > > > > > On Wed, Sep 21, 2016 at 4:55 PM, Vasiliki Kalavri < > > > vasilikikala...@gmail.com > > > > wrote: > > > > > > > Hi Olga, > > > > > > > > On 21 September 2016 at 18:50, Olga Golovneva > > > wrote: > > > > > > > > > Hi devs, > > > > > > > > > > I was working on (FLINK-1526) "Add Minimum Spanning Tree library > > > method > > > > > and example" issue. I've developed (Java) code that implements > > > > distributed > > > > > Boruvka's algorithm in Gelly library. I've run several tests and it > > > seems > > > > > to work fine, although I didn't test it on extremely large input > > graphs > > > > > yet, and I'm also trying to optimize my code. > > > > > Particularly, I have two main issues: > > > > > > > > > > 1. Nested loops. > > > > > I have to use nested loops, and I do not see the way to avoid them. > > As > > > > > they are currently not supported, I'm using Bulk Iterations inside > a > > > > > "classic" while loop. I've included in attachment simple example > > > > > MyNestedIterationExample that shows this issue. > > > > > > > > > > > > > I think you cannot add attachments to the mailing list. Could you > > upload > > > > your example somewhere and post a link here? 
I'm actually surprised > > that > > > > the while-loop works without problems. > > > > > > > > > > > > 2. For some reason I cannot create class that works with types with > > > > > generic variables in Tuple2(or Tuple3), thus my code does not > support > > > > > generic types. I also included simple example MyTuple3Example. Here > > is > > > > the > > > > > Exception I get: > > > > > "Exception in thread "main" org.apache.flink.api.common.functions. > > > > InvalidTypesException: > > > > > Type of TypeVariable 'EV' in 'class org.apache.flink.graph. > > > > > examples.MyTuple3Example$InitializeEdges' could not be determined. > > > This > > > > > is most likely a type erasure problem. The type extraction > currently > > > > > supports types with generic variables only in cases where all > > variables > > > > in > > > > > the return type can be deduced from the input type(s)." > > > > > > > > > > > > > Can you upload this example and link to it too? > > > > > > > > Thanks, > > > > -Vasia. > > > > > > > > > > > > > > > > > > I would really appreciate if someone could explain to me how to > > avoid > > > > > this Exception. Otherwise, I could submit my code for testing. > > > > > > > > > > Best regards, > > > > > Olga Golovneva > > > > > > > > > >
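The InvalidTypesException quoted above comes down to Java type erasure: the binding of a type variable such as 'EV' is simply gone at runtime unless it is pinned somewhere reflectively visible. A minimal plain-Java sketch (not Flink API; the class name is invented for illustration) shows both the problem and the anonymous-subclass trick that TypeHint-style APIs rely on:

```java
import java.lang.reflect.ParameterizedType;
import java.util.ArrayList;

public class ErasureDemo {
    public static void main(String[] args) {
        // At runtime both lists share a single erased class: the type argument is gone,
        // which is exactly why a framework cannot deduce 'EV' from the object alone.
        Object a = new ArrayList<String>();
        Object b = new ArrayList<Integer>();
        System.out.println(a.getClass() == b.getClass()); // true

        // An anonymous subclass records the type argument in its generic superclass
        // signature, so it CAN be recovered reflectively. This is the mechanism behind
        // explicitly supplying type information instead of relying on extraction.
        ArrayList<String> pinned = new ArrayList<String>() {};
        ParameterizedType t =
                (ParameterizedType) pinned.getClass().getGenericSuperclass();
        System.out.println(t.getActualTypeArguments()[0]); // class java.lang.String
    }
}
```

This is why the suggested fix in the thread is to pass TypeInformation explicitly: the caller states the types that erasure has discarded.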
Re: Reply: [discuss] merge module flink-yarn and flink-yarn-test
"flink-test-utils" contains, as the name says, utils for testing. Intended to be used by users in writing their own tests. "flink-tests" contains cross-module tests; no user should ever need to have a dependency on that. They are different because users explicitly asked for test utils to be factored into a separate project. As an honest reply here: Setting up a project as huge as Flink needs to take many things into account - Multiple languages (Java / Scala), with limitations of IDEs in mind - Dependency conflicts and much shading magic - Dependency matrices (multiple Hadoop and Scala versions) - Supporting earlier Java versions - clean scope differentiation, so users can reuse utils and testing code That simply requires some extra modules once in a while. Flink people have worked hard on coming up with a structure that serves the needs of production users and automated build/testing systems. These production user requests are most important to us, and sometimes, we need to take cuts in "beauty of directory structure" to help them. Constantly accusing the community of creating bad structures before even trying to understand the reasoning behind that does not come across as very friendly. Constantly accusing the community of sloppy work just because your laptop settings are incompatible with the default configuration likewise. I hope you understand that. On Thu, Sep 22, 2016 at 2:58 AM, shijinkui wrote: > Hi, Stephan > > Thanks for your reply. > > In my mind, Maven-shade-plugin and sbt-assembly both exclude test > code from the fat jar by default. > > In fact, unit tests are used to test the main code, ensuring our code logic > fits our expectations. This is a general convention, I think. Flink has become a top > Apache project. We shouldn't be special. We're programmers; we should be > professional. > > Even more, there are `flink-test-utils-parent` and `flink-tests` modules; > what's the relation between them? > > I have to ask why they exist? 
Where did such confusing > modules come from? > > I don't think we should do nothing about this. Code and design should be > comfortable. > > Thanks > > From Jinkui Shi > > -----Original Message----- > From: Stephan Ewen [mailto:se...@apache.org] > Sent: 21 September 2016 22:19 > To: dev@flink.apache.org > Subject: Re: [discuss] merge module flink-yarn and flink-yarn-test > > I would like Robert to comment on this. > > I think there was a reason to have different modules, which had again > something to do with the Maven Shade Plugin. Dependencies and shading really > seem to be the trickiest things in bigger Java/Scala projects ;-) > > On Wed, Sep 21, 2016 at 11:04 AM, shijinkui wrote: > > > Hi, All > > > > There are too many modules in the root. There is no need to separate > > the test code from each sub-module. > > > > I have never seen such a design: two modules, one for main code, the other for > > test code. > > > > Is there some special reason? > > > > From Jinkui Shi > > >
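For context on why separate test artifacts can exist at all: Maven lets a module publish its test classes as an extra `*-tests.jar` alongside the main jar. A sketch of the relevant pom.xml fragment — this is the standard maven-jar-plugin mechanism, not copied from Flink's actual poms:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <!-- attaches a second artifact with classifier "tests" -->
            <goal>test-jar</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Downstream modules can then depend on the test classes with `<classifier>tests</classifier>` and `<scope>test</scope>`, which is how test utilities get shared across modules without leaking into production jars.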
Re: On (FLINK-1526) JIRA issue
Just as a general comment: A program with nested loops is most likely not going to be performant in any way. It makes sense to re-think the algorithm, come up with a modified or different pattern, rather than trying to implement the exact algorithm line by line. It may be worth checking that, because I am not sure if Gelly should have algorithms that don't perform well. On Thu, Sep 22, 2016 at 11:40 AM, Vasiliki Kalavri < vasilikikala...@gmail.com> wrote: > Hi Olga, > > when you use mapEdges() or mapVertices() with generics, Flink cannot > determine the type because of type erasure, like the exception says. That's > why we also provide methods that take the type information as a parameter. > You can use those to make the return type explicit. In your example, you > should do something like the following (line 41): > > final TypeInformation longType = BasicTypeInfo.LONG_TYPE_INFO; > final TypeInformation doubleType = BasicTypeInfo.DOUBLE_TYPE_INFO; > Graph> graphOut = > graph.mapEdges(new InitializeEdges(), new > TupleTypeInfo(longType, longType, > new TupleTypeInfo>(doubleType, > longType,longType))); > > Regarding the nested loops, I am almost sure that you will face problems if > you try to experiment with large datasets. I haven't looked into your code > yet, but according to the JIRA discussion, we've faced this problem before > and afaik, this is still an issue. > > Cheers, > -Vasia. > > On 22 September 2016 at 01:12, Olga Golovneva wrote: > > > Hi Vasia, > > > > I have uploaded these tests on github: > > https://github.com/OlgaGolovneva/MST/tree/master/tests > > > > I have also uploaded source code, but I'm still working on it: > > https://github.com/OlgaGolovneva/MST/tree/master/src > > > > >I think you cannot add attachments to the mailing list. Could you > upload > > >your example somewhere and post a link here? I'm actually surprised that > > >the while-loop works without problems. 
> > > > I have run the program on several simple tests, and I was going to try > > large datasets in the next few days. Please, let me know if this approach > > is wrong. > > > > Thanks, > > Olga > > > > On Wed, Sep 21, 2016 at 4:55 PM, Vasiliki Kalavri < > > vasilikikala...@gmail.com > > > wrote: > > > > > Hi Olga, > > > > > > On 21 September 2016 at 18:50, Olga Golovneva > > wrote: > > > > > > > Hi devs, > > > > > > > > I was working on (FLINK-1526) "Add Minimum Spanning Tree library > > method > > > > and example" issue. I've developed (Java) code that implements > > > distributed > > > > Boruvka's algorithm in Gelly library. I've run several tests and it > > seems > > > > to work fine, although I didn't test it on extremely large input > graphs > > > > yet, and I'm also trying to optimize my code. > > > > Particularly, I have two main issues: > > > > > > > > 1. Nested loops. > > > > I have to use nested loops, and I do not see the way to avoid them. > As > > > > they are currently not supported, I'm using Bulk Iterations inside a > > > > "classic" while loop. I've included in attachment simple example > > > > MyNestedIterationExample that shows this issue. > > > > > > > > > > I think you cannot add attachments to the mailing list. Could you > upload > > > your example somewhere and post a link here? I'm actually surprised > that > > > the while-loop works without problems. > > > > > > > > > > > > > > 2. For some reason I cannot create class that works with types with > > > > generic variables in Tuple2(or Tuple3), thus my code does not support > > > > generic types. I also included simple example MyTuple3Example. Here > is > > > the > > > > Exception I get: > > > > "Exception in thread "main" org.apache.flink.api.common.functions. > > > InvalidTypesException: > > > > Type of TypeVariable 'EV' in 'class org.apache.flink.graph. > > > > examples.MyTuple3Example$InitializeEdges' could not be determined. > > This > > > > is most likely a type erasure problem. 
The type extraction currently > > > > supports types with generic variables only in cases where all > variables > > > in > > > > the return type can be deduced from the input type(s)." > > > > > > > > > > Can you upload this example and link to it too? > > > > > > Thanks, > > > -Vasia. > > > > > > > > > > > > > > I would really appreciate if someone could explain to me how to > avoid > > > > this Exception. Otherwise, I could submit my code for testing. > > > > > > > > Best regards, > > > > Olga Golovneva > > > > > > > > > >
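To make the "bulk iteration inside a classic while loop" pattern from this thread concrete, here is a plain-Java conceptual sketch — not Flink's DataSet API; the class name and the toy min-relaxation step are invented for illustration. The outer while loop plays the role of the driver-side loop that re-submits work until a fixed point is reached, and the inner for loop stands in for one bulk iteration of fixed depth:

```java
import java.util.Arrays;

public class NestedIterationSketch {
    // Conceptual model of nested iterations: each pass of the outer while loop
    // corresponds to a separately driven round (in Flink, a re-submitted job),
    // which is the main reason this pattern tends to perform poorly at scale.
    static int[] run(int[] values, int innerSteps) {
        boolean changed = true;
        while (changed) {                        // outer "classic" while loop
            int[] before = values.clone();
            for (int s = 0; s < innerSteps; s++) {   // inner fixed-depth iteration
                for (int i = 1; i < values.length; i++) {
                    // toy relaxation step: propagate minima one position forward
                    values[i] = Math.min(values[i], values[i - 1]);
                }
            }
            changed = !Arrays.equals(before, values); // stop at a fixed point
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(new int[]{5, 3, 4, 1, 2}, 1)));
        // prints [5, 3, 3, 1, 1]
    }
}
```

In a distributed setting the convergence check itself (comparing `before` and `values`) would require collecting state back to the driver each round, which adds to the cost Stephan's comment warns about.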
[jira] [Created] (FLINK-4662) Bump Calcite version up to 1.9
Timo Walther created FLINK-4662: --- Summary: Bump Calcite version up to 1.9 Key: FLINK-4662 URL: https://issues.apache.org/jira/browse/FLINK-4662 Project: Flink Issue Type: Improvement Components: Table API & SQL Reporter: Timo Walther Calcite just released version 1.9. We should also adopt it in the Table API, especially for FLINK-4294. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: On (FLINK-1526) JIRA issue
Hi Olga, when you use mapEdges() or mapVertices() with generics, Flink cannot determine the type because of type erasure, like the exception says. That's why we also provide methods that take the type information as a parameter. You can use those to make the return type explicit. In your example, you should do something like the following (line 41): final TypeInformation longType = BasicTypeInfo.LONG_TYPE_INFO; final TypeInformation doubleType = BasicTypeInfo.DOUBLE_TYPE_INFO; Graph> graphOut = graph.mapEdges(new InitializeEdges(), new TupleTypeInfo(longType, longType, new TupleTypeInfo>(doubleType, longType,longType))); Regarding the nested loops, I am almost sure that you will face problems if you try to experiment with large datasets. I haven't looked into your code yet, but according to the JIRA discussion, we've faced this problem before and afaik, this is still an issue. Cheers, -Vasia. On 22 September 2016 at 01:12, Olga Golovneva wrote: > Hi Vasia, > > I have uploaded these tests on github: > https://github.com/OlgaGolovneva/MST/tree/master/tests > > I have also uploaded source code, but I'm still working on it: > https://github.com/OlgaGolovneva/MST/tree/master/src > > >I think you cannot add attachments to the mailing list. Could you upload > >your example somewhere and post a link here? I'm actually surprised that > >the while-loop works without problems. > > I have run the program on several simple tests, and I was going to try > large datasets in the next few days. Please, let me know if this approach > is wrong. > > Thanks, > Olga > > On Wed, Sep 21, 2016 at 4:55 PM, Vasiliki Kalavri < > vasilikikala...@gmail.com > > wrote: > > > Hi Olga, > > > > On 21 September 2016 at 18:50, Olga Golovneva > wrote: > > > > > Hi devs, > > > > > > I was working on (FLINK-1526) "Add Minimum Spanning Tree library > method > > > and example" issue. I've developed (Java) code that implements > > distributed > > > Boruvka's algorithm in Gelly library. 
I've run several tests and it > seems > > > to work fine, although I didn't test it on extremely large input graphs > > > yet, and I'm also trying to optimize my code. > > > Particularly, I have two main issues: > > > > > > 1. Nested loops. > > > I have to use nested loops, and I do not see the way to avoid them. As > > > they are currently not supported, I'm using Bulk Iterations inside a > > > "classic" while loop. I've included in attachment simple example > > > MyNestedIterationExample that shows this issue. > > > > > > > I think you cannot add attachments to the mailing list. Could you upload > > your example somewhere and post a link here? I'm actually surprised that > > the while-loop works without problems. > > > > > > > > > > 2. For some reason I cannot create class that works with types with > > > generic variables in Tuple2(or Tuple3), thus my code does not support > > > generic types. I also included simple example MyTuple3Example. Here is > > the > > > Exception I get: > > > "Exception in thread "main" org.apache.flink.api.common.functions. > > InvalidTypesException: > > > Type of TypeVariable 'EV' in 'class org.apache.flink.graph. > > > examples.MyTuple3Example$InitializeEdges' could not be determined. > This > > > is most likely a type erasure problem. The type extraction currently > > > supports types with generic variables only in cases where all variables > > in > > > the return type can be deduced from the input type(s)." > > > > > > > Can you upload this example and link to it too? > > > > Thanks, > > -Vasia. > > > > > > > > > > I would really appreciate if someone could explain to me how to avoid > > > this Exception. Otherwise, I could submit my code for testing. > > > > > > Best regards, > > > Olga Golovneva > > > > > >
[jira] [Created] (FLINK-4661) Failure to find org.apache.flink:flink-runtime_2.10:jar:tests
shijinkui created FLINK-4661: Summary: Failure to find org.apache.flink:flink-runtime_2.10:jar:tests Key: FLINK-4661 URL: https://issues.apache.org/jira/browse/FLINK-4661 Project: Flink Issue Type: Bug Reporter: shijinkui [ERROR] Failed to execute goal on project flink-streaming-java_2.10: Could not resolve dependencies for project org.apache.flink:flink-streaming-java_2.10:jar:1.2-SNAPSHOT: Failure to find org.apache.flink:flink-runtime_2.10:jar:tests:1.2-SNAPSHOT in http://localhost:/repository/maven-public/ was cached in the local repository, resolution will not be reattempted until the update interval of nexus-releases has elapsed or updates are forced -> [Help 1] Failure to find org.apache.flink:flink-runtime_2.10:jar:tests I can't find where this tests jar is generated. By the way, in the recent half month I have started to use Flink, and not once have I been able to compile the Flink project with the default settings.
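The missing artifact `flink-runtime_2.10:jar:tests` is a Maven "tests"-classifier jar, produced by the maven-jar-plugin's test-jar goal when the producing module is built and installed. A hedged sketch of how such a dependency is typically declared — coordinates taken from the error message above, the surrounding pom structure is illustrative:

```xml
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-runtime_2.10</artifactId>
  <version>1.2-SNAPSHOT</version>
  <!-- resolves flink-runtime_2.10-1.2-SNAPSHOT-tests.jar, not the main jar -->
  <classifier>tests</classifier>
  <scope>test</scope>
</dependency>
```

Since this is a SNAPSHOT classifier artifact, it only appears in a repository after the producing module's install/deploy phase has run; a full `mvn install` from the source root (with tests skipped if desired) is the usual way to get it into the local repository before building individual modules.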