Thanks for your feedback. I agree on the main-method "problem". Scanning for and listing everything that is found is fine.
The tricky question is the automatic invocation mechanism if the "-c"
flag is not used and no "program-class" or "Main-Class" manifest entry
is found. If multiple classes implement the "Program" interface, an
exception should be thrown (I think that would make sense). However, I
am not sure what "good" behavior is if a single "Program" class is
found plus an additional main-method class:
 - should the "Program" class be executed (ie, "override" the
   main-method class)
 - or is it better to throw an exception?
If no "Program" class is found, but a single main-method class, Flink
could execute it using the main method. But I am not sure either if
this is "good" behavior. If multiple main-method classes are present,
throwing an exception is the only way to go, I guess.

To sum up: Should Flink consider main-method classes for automatic
invocation, or should main-method classes be required to be listed in
the "program-class" or "Main-Class" manifest parameter (to enable them
for automatic invocation)?

-Matthias

On 05/22/2015 09:56 AM, Maximilian Michels wrote:
> Hi Matthias,
>
> Thank you for taking the time to analyze Flink's invocation behavior. I
> like your proposal. I'm not sure whether it is a good idea to scan the
> entire JAR for main methods. Sometimes, main methods are added solely for
> testing purposes and don't really serve any practical use. However, if
> you're already going through the JAR to find the ProgramDescription
> interface, then you might look for main methods as well. As long as it is
> just a listing without execution, that should be fine.
>
> Best regards,
> Max
>
> On Thu, May 21, 2015 at 3:43 PM, Matthias J. Sax <
> mj...@informatik.hu-berlin.de> wrote:
>
>> Hi,
>>
>> I had a look into the current workflow of Flink with regard to the
>> processing steps of a jar file.
>>
>> If I got it right, it works as follows (not sure if this is documented
>> somewhere):
>>
>> 1) check if the "-c" flag is used to set the program entry point
>>    if yes, goto 4
>> 2) try to extract the "program-class" property from the manifest
>>    (if found, goto 4)
>> 3) try to extract the "Main-Class" property from the manifest
>>    -> if not found, throw an exception (this also happens if no
>>    manifest file is found at all)
>>
>> 4) check if the entry point class implements the "Program" interface
>>    if yes, goto 6
>> 5) check if the entry point class provides a "public static void
>>    main(String[] args)" method
>>    -> if not, throw an exception
>>
>> 6) execute the program (ie, show plan/info or really run it)
>>
>>
>> I also "discovered" the interface "ProgramDescription" with a single
>> method "String getDescription()". Even though some examples implement
>> this interface (and use it in the example itself), Flink basically
>> ignores it... From the CLI there is no way to get this info, and the
>> WebUI does actually get it if present, but doesn't show it anywhere...
>>
>>
>> I think it would be nice if we extended the following functions:
>>
>> - extend the possibility to specify multiple entry classes in
>>   "program-class" or "Main-Class" -> in this case, the user needs to
>>   use the "-c" flag to pick the program to run every time
>>
>> - add a CLI option that allows the user to see what entry point
>>   classes are available
>>   for this, consider
>>   a) the "program-class" entry
>>   b) the "Main-Class" entry
>>   c) if neither is found, scan the jar file for classes implementing
>>      the "Program" interface
>>   d) if still not found, scan the jar file for classes with a "main"
>>      method
>>
>> - if the user looks for entry point classes via the CLI, check for the
>>   "ProgramDescription" interface and show the info
>>
>> - extend the WebUI to show all available entry classes (a pull request
>>   is already there, for multiple entries in "program-class")
>>
>> - extend the WebUI to show the "ProgramDescription" info
>>
>>
>> What do you think?
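The resolution order quoted above could be sketched like this. This is a hypothetical illustration of the described steps 1-3 only, not Flink's actual PackagedProgram code; the class and method names are invented for the example:

```java
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical sketch of the entry-point resolution order described above.
// Only the manifest-based steps (1-3) are shown; steps 4-6 (checking for
// the "Program" interface vs. a main method) happen after class loading.
public class EntryPointResolver {

    public static String resolveEntryClass(String jarPath, String cliClassOption)
            throws Exception {
        // 1) the "-c" flag wins if given
        if (cliClassOption != null) {
            return cliClassOption;
        }
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest mf = jar.getManifest();
            if (mf != null) {
                // 2) try the "program-class" manifest attribute
                String programClass = mf.getMainAttributes().getValue("program-class");
                if (programClass != null) {
                    return programClass;
                }
                // 3) fall back to the standard "Main-Class" attribute
                String mainClass = mf.getMainAttributes().getValue("Main-Class");
                if (mainClass != null) {
                    return mainClass;
                }
            }
        }
        // also reached when the jar has no manifest at all
        throw new RuntimeException("No -c flag, program-class, or Main-Class entry found");
    }
}
```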
>> I am not too sure about the "auto scan" of the jar file if no
>> manifest entry is provided. We might get some "fat jars", and
>> scanning might take some time.
>>
>>
>> -Matthias
>>
>>
>>
>>
>> On 05/19/2015 10:44 AM, Stephan Ewen wrote:
>>> We actually had an interface like that before ("Program"). It is
>>> still supported, but in all new programs we simply use the Java main
>>> method. The advantage is that most IDEs can create executable JARs
>>> automatically, setting the JAR manifest attributes, etc.
>>>
>>> The "Program" interface still works, though. Most tool classes (like
>>> "PackagedProgram") have a way to figure out whether the code uses
>>> "main()" or implements "Program" and calls the right method.
>>>
>>> You can try and extend the program interface. If you want to
>>> consistently support multiple programs in one JAR file, you may need
>>> to adjust the util classes as well to deal with that.
>>>
>>>
>>>
>>> On Tue, May 19, 2015 at 10:10 AM, Matthias J. Sax <
>>> mj...@informatik.hu-berlin.de> wrote:
>>>
>>>> Supporting an interface like this seems to be a nice idea. Any
>>>> other opinions on it?
>>>>
>>>> It seems to be some more work to get it done right. I don't want to
>>>> start working on it before it's clear that it has a chance to be
>>>> included in Flink.
>>>>
>>>> @Flavio: I moved the discussion to the dev mailing list (the user
>>>> list is not appropriate for this discussion). Are you subscribed to
>>>> it, or should I cc you in each mail?
>>>>
>>>>
>>>> -Matthias
>>>>
>>>>
>>>> On 05/19/2015 09:39 AM, Flavio Pompermaier wrote:
>>>>> Nice feature Matthias!
>>>>> My suggestion is to create a specific Flink interface to also get
>>>>> a description of a job and to standardize parameter passing.
>>>>> Then, somewhere (e.g. the manifest) you could specify the list of
>>>>> packages (or also directly the classes) to inspect with reflection
>>>>> to extract the list of available Flink jobs.
>>>>> Something like:
>>>>>
>>>>> public interface FlinkJob {
>>>>>
>>>>>   /** The name to display in the job submission UI or shell */
>>>>>   // e.g. "My Flink HelloWorld"
>>>>>   String getDisplayName();
>>>>>   // e.g. "This program does this and that etc.."
>>>>>   String getDescription();
>>>>>   // e.g. <0,Integer,"An integer representing my first param">,
>>>>>   //      <1,String,"A string representing my second param">
>>>>>   List<Tuple3<Integer, TypeInfo, String>> paramDescription;
>>>>>   /** Set up the Flink job in the passed ExecutionEnvironment */
>>>>>   ExecutionEnvironment config(ExecutionEnvironment env);
>>>>> }
>>>>>
>>>>> What do you think?
>>>>>
>>>>>
>>>>>
>>>>> On Sun, May 17, 2015 at 10:38 PM, Matthias J. Sax <
>>>>> mj...@informatik.hu-berlin.de> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I like the idea that Flink's WebClient can show different plans
>>>>>> for different jobs within a single jar file.
>>>>>>
>>>>>> I prepared a prototype for this feature. You can find it here:
>>>>>> https://github.com/mjsax/flink/tree/multipleJobsWebUI
>>>>>>
>>>>>> To test the feature, you need to prepare a jar file that contains
>>>>>> the code of multiple programs and specify each entry class in the
>>>>>> manifest file as comma-separated values in the "program-class"
>>>>>> line.
>>>>>>
>>>>>> Feedback is welcome. :)
>>>>>>
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>>
>>>>>> On 05/08/2015 03:08 PM, Flavio Pompermaier wrote:
>>>>>>> Thank you all for the support!
>>>>>>> It would be a really nice feature if the web client could show
>>>>>>> me the list of Flink jobs within my jar..
>>>>>>> it should be sufficient to mark them with a special annotation
>>>>>>> and inspect the classes within the jar..
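The jar inspection Flavio describes could be sketched roughly as follows. This is a hypothetical helper using plain Java reflection, not Flink code; the class and method names are invented for the example:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical sketch: scan a jar for classes implementing a given
// interface, as suggested for discovering available jobs / entry points.
public class JarScanner {

    public static List<String> findImplementors(String jarPath, Class<?> iface)
            throws Exception {
        List<String> result = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath);
             URLClassLoader loader = new URLClassLoader(
                     new URL[]{ new java.io.File(jarPath).toURI().toURL() })) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (!name.endsWith(".class")) continue;
                // "com/foo/Bar.class" -> "com.foo.Bar"
                String className =
                        name.substring(0, name.length() - 6).replace('/', '.');
                try {
                    // load without initializing, so static blocks don't run
                    Class<?> clazz = Class.forName(className, false, loader);
                    if (iface.isAssignableFrom(clazz) && !clazz.isInterface()) {
                        result.add(className);
                    }
                } catch (Throwable ignored) {
                    // classes with unresolvable dependencies cannot be
                    // inspected; skip them
                }
            }
        }
        return result;
    }
}
```

On big "fat jars" this walk over every class entry is exactly the cost Matthias worries about below.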
>>>>>>>
>>>>>>> On Fri, May 8, 2015 at 3:03 PM, Malte Schwarzer <m...@mieo.de
>>>>>>> <mailto:m...@mieo.de>> wrote:
>>>>>>>
>>>>>>>     Hi Flavio,
>>>>>>>
>>>>>>>     you can also put each job in a single class and use the -c
>>>>>>>     parameter to execute jobs separately:
>>>>>>>
>>>>>>>     /bin/flink run -c com.myflinkjobs.JobA /path/to/jar/multiplejobs.jar
>>>>>>>     /bin/flink run -c com.myflinkjobs.JobB /path/to/jar/multiplejobs.jar
>>>>>>>     ...
>>>>>>>
>>>>>>>     Cheers
>>>>>>>     Malte
>>>>>>>
>>>>>>>     From: Robert Metzger <rmetz...@apache.org <mailto:rmetz...@apache.org>>
>>>>>>>     Reply-To: <u...@flink.apache.org <mailto:u...@flink.apache.org>>
>>>>>>>     Date: Friday, 8 May 2015 14:57
>>>>>>>     To: "u...@flink.apache.org <mailto:u...@flink.apache.org>"
>>>>>>>     <u...@flink.apache.org <mailto:u...@flink.apache.org>>
>>>>>>>     Subject: Re: Package multiple jobs in a single jar
>>>>>>>
>>>>>>>     Hi Flavio,
>>>>>>>
>>>>>>>     the pom from our quickstart is a good reference:
>>>>>>>     https://github.com/apache/flink/blob/master/flink-quickstart/flink-quickstart-java/src/main/resources/archetype-resources/pom.xml
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     On Fri, May 8, 2015 at 2:53 PM, Flavio Pompermaier
>>>>>>>     <pomperma...@okkam.it <mailto:pomperma...@okkam.it>> wrote:
>>>>>>>
>>>>>>>         Ok, got it.
>>>>>>>         And is there a reference pom.xml for shading my
>>>>>>>         application into one fat jar? Which Flink dependencies
>>>>>>>         can I exclude?
>>>>>>>
>>>>>>>         On Fri, May 8, 2015 at 1:05 PM, Fabian Hueske <fhue...@gmail.com
>>>>>>>         <mailto:fhue...@gmail.com>> wrote:
>>>>>>>
>>>>>>>             I didn't say that the main should return the
>>>>>>>             ExecutionEnvironment.
>>>>>>>             You can define and execute as many programs in a
>>>>>>>             main function as you like.
>>>>>>>             The program can be defined somewhere else, e.g., in
>>>>>>>             a function that receives an ExecutionEnvironment and
>>>>>>>             attaches a program, such as
>>>>>>>
>>>>>>>             public void buildMyProgram(ExecutionEnvironment env) {
>>>>>>>               DataSet<String> lines = env.readTextFile(...);
>>>>>>>               // do something
>>>>>>>               lines.writeAsText(...);
>>>>>>>             }
>>>>>>>
>>>>>>>             That method could be invoked from main():
>>>>>>>
>>>>>>>             public static void main(String[] args) {
>>>>>>>               ExecutionEnvironment env = ...
>>>>>>>
>>>>>>>               if (...) {
>>>>>>>                 buildMyProgram(env);
>>>>>>>               } else {
>>>>>>>                 buildSomeOtherProg(env);
>>>>>>>               }
>>>>>>>
>>>>>>>               env.execute();
>>>>>>>
>>>>>>>               // run some more programs
>>>>>>>             }
>>>>>>>
>>>>>>>             2015-05-08 12:56 GMT+02:00 Flavio Pompermaier
>>>>>>>             <pomperma...@okkam.it <mailto:pomperma...@okkam.it>>:
>>>>>>>
>>>>>>>                 Hi Fabian,
>>>>>>>                 thanks for the response.
>>>>>>>                 So my mains should be converted into a method
>>>>>>>                 returning the ExecutionEnvironment.
>>>>>>>                 However, I think it would be very nice to have a
>>>>>>>                 syntax like the one of the Hadoop ProgramDriver
>>>>>>>                 to define jobs to invoke from a single root class.
>>>>>>>                 Do you think it could be useful?
>>>>>>>
>>>>>>>                 On Fri, May 8, 2015 at 12:42 PM, Fabian Hueske
>>>>>>>                 <fhue...@gmail.com <mailto:fhue...@gmail.com>> wrote:
>>>>>>>
>>>>>>>                     You can easily have multiple Flink programs
>>>>>>>                     in a single JAR file.
>>>>>>>                     A program is defined using an
>>>>>>>                     ExecutionEnvironment and executed when you
>>>>>>>                     call ExecutionEnvironment.execute().
>>>>>>>                     Where and how you do that does not matter.
>>>>>>>
>>>>>>>                     You can for example implement a main
>>>>>>>                     function such as:
>>>>>>>
>>>>>>>                     public static void main(String... args) {
>>>>>>>
>>>>>>>                       if (today == Monday) {
>>>>>>>                         ExecutionEnvironment env = ...
>>>>>>>                         // define Monday prog
>>>>>>>                         env.execute();
>>>>>>>                       } else {
>>>>>>>                         ExecutionEnvironment env = ...
>>>>>>>                         // define other prog
>>>>>>>                         env.execute();
>>>>>>>                       }
>>>>>>>                     }
>>>>>>>
>>>>>>>                     2015-05-08 11:41 GMT+02:00 Flavio Pompermaier
>>>>>>>                     <pomperma...@okkam.it <mailto:pomperma...@okkam.it>>:
>>>>>>>
>>>>>>>                         Hi to all,
>>>>>>>                         is there any way to keep multiple jobs
>>>>>>>                         in a jar and then choose at runtime the
>>>>>>>                         one to execute (like what ProgramDriver
>>>>>>>                         does in Hadoop)?
>>>>>>>
>>>>>>>                         Best,
>>>>>>>                         Flavio
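The ProgramDriver-style dispatch Flavio asks about could be sketched like this. It is a hypothetical, self-contained illustration (job names and the Consumer stand-ins are invented); a real Flink version would attach programs to an ExecutionEnvironment instead:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of a Hadoop-ProgramDriver-style dispatcher:
// one main() that picks a job by its first argument and forwards the rest.
public class JobDriver {

    private final Map<String, Consumer<String[]>> jobs = new LinkedHashMap<>();

    public void addJob(String name, Consumer<String[]> job) {
        jobs.put(name, job);
    }

    /** Runs the job named by args[0], passing the remaining args through. */
    public void run(String[] args) {
        if (args.length == 0 || !jobs.containsKey(args[0])) {
            throw new IllegalArgumentException(
                    "Unknown job; available: " + jobs.keySet());
        }
        String[] jobArgs = new String[args.length - 1];
        System.arraycopy(args, 1, jobArgs, 0, jobArgs.length);
        jobs.get(args[0]).accept(jobArgs);
    }

    public static void main(String[] args) {
        JobDriver driver = new JobDriver();
        // illustrative job registrations
        driver.addJob("wordcount", a -> System.out.println("running wordcount"));
        driver.addJob("pagerank", a -> System.out.println("running pagerank"));
        driver.run(args);
    }
}
```

With this pattern, `bin/flink run multiplejobs.jar wordcount ...` would pick the job by name at runtime instead of requiring a separate `-c` entry class per job.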