Dear Matthieu
Yes, I am trying to port s4-yarn to 0.6.0.
The -testMode option is what 0.6.0 now has in org.apache.s4.tools.Deploy.java,

    // Explicitly shutdown the JVM since Gradle leaves non-daemon
    // threads running that delay the termination
    if (!deployArgs.testMode) {
        System.exit(0);
    }

just like the -shutdown option in the same class file of S4-25:

    // Explicitly shutdown the JVM since Gradle leaves non-daemon
    // threads running that delay the termination
    if (deployArgs.shutdown) {
        System.exit(0);
    }

But the difference is in how the two options are declared:

    @Parameter(names = "-testMode", description = "Special mode for regression testing", hidden = true)

    @Parameter(names = "-shutdown", description = "Shutdown JVM after deployment. Useful to avoid waiting for remaining long running threads from Gradle", arity = 1)

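
Note that the two flags also have opposite polarity, which matters when one replaces the other. The following is a minimal sketch that models only the two exit conditions quoted above; the names `ExitPolicySketch`, `exits060` and `exitsS425` are hypothetical and not part of S4.

```java
// Hypothetical sketch: models only the two exit conditions quoted above.
class ExitPolicySketch {

    // 0.6.0: Deploy calls System.exit(0) unless testMode is set.
    static boolean exits060(boolean testMode) {
        return !testMode;
    }

    // S4-25: Deploy calls System.exit(0) only when shutdown is set.
    static boolean exitsS425(boolean shutdown) {
        return shutdown;
    }

    public static void main(String[] args) {
        for (boolean flag : new boolean[] {false, true}) {
            System.out.println("testMode=" + flag + " -> exits: " + exits060(flag)
                    + " | shutdown=" + flag + " -> exits: " + exitsS425(flag));
        }
    }
}
```

In other words, keeping the JVM alive corresponds to -shutdown=false on S4-25 and to testMode being set on 0.6.0.
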
I tried to pass "-testMode=true" to Deploy.main():

    String[] argDeploy = {
        "-s4r=" + s4r_path_HDFS,
        "-cluster=" + cluster_name,
        "-appName=" + application_name,
        "-testMode=true"
    };
    Deploy.main(argDeploy);

but got an error:

Cannot parse arguments: class com.beust.jcommander.ParameterException -> Was passed main parameter 'true' but no main parameter was defined
Usage
Usage: <main class> [options]
  Options:
    -a, -appClass               Full class name of the application class
                                (extending App or AdapterApp)
  * -appName                    Name of S4 application.
  * -c, -cluster                Logical name of the S4 cluster
    -debug                      Display debug information from the build system
                                Default: false
    -gradleOpts                 gradle system properties (as in GRADLE_OPTS
                                environment properties) passed to gradle scripts
                                Default: []
    -help                       usage
                                Default: false
    -modulesClasses, -emc, -mc  Fully qualified class names of custom modules
                                Default: []
    -modulesURIs, -mu           URIs for fetching code of custom modules
                                Default: []
    -namedStringParameters, -p  Comma-separated list of inline configuration
                                parameters. Syntax:
                                '-p=name1=value1,name2=value2 '
                                Default: []
    -s4r                        URI to existing s4r file
    -timeout                    Connection timeout to Zookeeper, in ms
                                Default: 10000
    -zk                         ZooKeeper connection string
                                Default: localhost:2181

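
A likely cause, stated here as an assumption rather than something confirmed in this thread: -testMode is declared without an arity, and JCommander treats such boolean options as flags that take no value. Since the other options are accepted as -name=value, "=" evidently acts as a separator, so "-testMode=true" would be read as the flag followed by a stray "true", which matches the "main parameter" error. The stdlib-only sketch below models that token handling. `FlagAritySketch`, `leftovers` and `deployArgs` are hypothetical names, and the code calls neither JCommander nor S4.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical model of option arity; it does not use JCommander or S4 classes.
class FlagAritySketch {

    // Arity per option: 0 = bare boolean flag, 1 = option that takes one value.
    static final Map<String, Integer> ARITY = Map.of(
            "-s4r", 1, "-cluster", 1, "-appName", 1, "-testMode", 0);

    // Returns the tokens that no option consumed (the "main parameters").
    static List<String> leftovers(String[] args) {
        // Split "-name=value" into separate tokens, as a "=" separator would.
        List<String> tokens = new ArrayList<>();
        for (String arg : args) {
            tokens.addAll(Arrays.asList(arg.split("=", 2)));
        }
        List<String> unconsumed = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            Integer arity = ARITY.get(tokens.get(i));
            if (arity == null) {
                unconsumed.add(tokens.get(i)); // not an option: stray value
            } else {
                i += arity; // skip the values this option consumes
            }
        }
        return unconsumed;
    }

    // Builds the argument array with -testMode passed as a bare flag.
    static String[] deployArgs(String s4rUri, String cluster, String appName) {
        return new String[] {
            "-s4r=" + s4rUri, "-cluster=" + cluster, "-appName=" + appName, "-testMode"
        };
    }

    public static void main(String[] args) {
        String[] failing = {
            "-s4r=hdfs://example/app.s4r", "-cluster=c1", "-appName=a1", "-testMode=true"
        };
        System.out.println(leftovers(failing)); // prints [true]
        System.out.println(leftovers(deployArgs("hdfs://example/app.s4r", "c1", "a1"))); // prints []
    }
}
```

Under that assumption, passing "-testMode" on its own (without "=true") in argDeploy would avoid the stray token. Whether Deploy in 0.6.0 then behaves as needed on Yarn is not verified here.
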
Best Regards
Jihyoun
On Mon, Apr 8, 2013 at 4:23 PM, Matthieu Morel <[email protected]> wrote:
>
> On Apr 8, 2013, at 06:17 , JiHyoun Park wrote:
>
> Dear Matthieu
>
> What we need for the Yarn integration is just to include 2
> hdfs-deploy-related classes, which were developed at S4-25, in the s4
> core-deploy package.
>
> - org.apache.s4.deploy.HdfsFetcherModule.java
> - org.apache.s4.deploy.HdfsS4RFetcher.java
>
> And a simple modification to org.apache.s4.core.util.RemoteFileFetcher.java
> so that it can identify "hdfs" as one of the s4r download sources:
>
>     if ("hdfs".equalsIgnoreCase(scheme)) {
>         return new HdfsArchiveFetcher().fetch(uri);
>     }
>
>
>
> Hi Jihyoun,
>
> adding Yarn/Hadoop dependencies in s4-core is something we want to avoid,
> so that we don't force a specific version of Hadoop.
>
> Instead, for S4 0.6, we could actually inject the fetchers through a
> custom module. We'd ship the custom module separately from s4-core,
> avoiding the dependency coupling issue.
>
> Can you add a ticket for this? Thanks!
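
The custom-module idea above can be pictured as a scheme-to-fetcher registry that the core owns and that separately shipped code extends. This is only a stdlib sketch of the pattern: `ArchiveFetcher`, `FetcherRegistry` and the lambda standing in for an HDFS fetcher are hypothetical, and neither S4's module mechanism nor any Hadoop API is used.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical fetcher contract; the real S4 interface may differ.
interface ArchiveFetcher {
    InputStream fetch(URI uri) throws IOException;
}

// Core-side registry: it names no concrete scheme, so it needs no Hadoop dependency.
class FetcherRegistry {
    private final Map<String, ArchiveFetcher> byScheme = new ConcurrentHashMap<>();

    // Called by a separately shipped module to plug in support for a scheme.
    void register(String scheme, ArchiveFetcher fetcher) {
        byScheme.put(scheme.toLowerCase(Locale.ROOT), fetcher);
    }

    // Dispatches on the URI scheme, case-insensitively.
    InputStream fetch(URI uri) throws IOException {
        String scheme = uri.getScheme();
        ArchiveFetcher fetcher =
                scheme == null ? null : byScheme.get(scheme.toLowerCase(Locale.ROOT));
        if (fetcher == null) {
            throw new IOException("No fetcher registered for " + uri);
        }
        return fetcher.fetch(uri);
    }
}

class FetcherRegistryDemo {
    public static void main(String[] args) throws IOException {
        FetcherRegistry registry = new FetcherRegistry();
        // Stand-in for an HDFS fetcher that a separate module would contribute.
        registry.register("hdfs", uri -> new ByteArrayInputStream(
                ("stub:" + uri.getPath()).getBytes(StandardCharsets.UTF_8)));
        try (InputStream in = registry.fetch(URI.create("HDFS://namenode/apps/demo.s4r"))) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

With this shape, only the module that registers the "hdfs" fetcher would need Hadoop on its classpath, which is the decoupling described above.
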
>
>
>
> I also would like to ask you one more favour.
> Can we have the "-shutdown" option again at
> org.apache.s4.tools.Deploy.java to avoid automatic shutdown of S4
> application after deployment?
> I tried to use the "-testMode" option, which seemed to act just like the
> "-shutdown" option but my s4 application couldn't recognize the option.
>
>
> If I understand correctly, you tried to port s4-yarn to 0.6.0?
> Did you add the -testMode option in replacement of -shutdown=false here
> https://github.com/apache/incubator-s4/blob/S4-25/subprojects/s4-yarn/src/main/java/org/apache/s4/tools/yarn/S4YarnClient.java#L387
> ?
>
> Note that the S4 app being shut down without this option is actually a
> side effect of the deployment/configuration s4 tool on Yarn: we need to
> prevent System.exit calls since we are running in a contained
> environment.
>
> Also, if you have a working port of S4-25 to S4 0.6, you could submit a
> patch and we could integrate it. (if you are still iterating you can also
> fork the project on github and share your code of the port there, so we can
> provide feedback).
>
> Thanks,
>
> Matthieu
>
>
>
> Best Regards
> Jihyoun
>
>
> On Thu, Apr 4, 2013 at 5:26 PM, Matthieu Morel <[email protected]> wrote:
>
>> Hi,
>>
>> Note that S4 0.5 was a complete refactoring, therefore its main objective
>> was to provide a functional implementation. Thus there was room for
>> improvements and the focus of the 0.6 release was on performance and
>> usability.
>>
>> Most performance improvements in S4 0.6 come from:
>> - adding metrics to identify bottlenecks
>> - improving serialization and deserialization
>> - minimizing buffer copies (and pressure on the garbage collector)
>> - leveraging multithreading and async processing, notably by updating
>> Netty pipelines
>>
>> Regards,
>>
>> Matthieu
>>
>>
>>
>>
>> On Apr 4, 2013, at 07:01 , Siddharth wrote:
>>
>> Hi - Can the development team highlight the exact solution/fix that made
>> it possible for the 0.6 release to be so fast compared to the earlier
>> release?
>>
>> Thanks in advance,
>>
>> Siddharth
>>
>> ------------------------------
>>
>> *From:* Matthieu Morel [mailto:[email protected]]
>> *Sent:* Wednesday, April 03, 2013 3:02 PM
>> *To:* [email protected]
>> *Subject:* Re: S4-0.6.0 and Hadoop Yarn
>>
>> On Apr 2, 2013, at 19:46 , Jeryl Cook wrote:
>>
>> "handle 200K+ messages per sec", in one instance? Or do you mean
>> clustered?
>>
>> This is for processing small events injected into 1 stream on 1 node. By
>> using more streams and more nodes the overall throughput can get much
>> higher.
>>
>> Note that this is a baseline with a basic PE graph (1 injector and 1 PE
>> prototype) and performance in practice will be impacted by the complexity
>> of the application and the nature of the processing, the hardware and
>> allocated resources, the size and complexity of messages, etc.
>>
>> A benchmarking framework is included in the distribution, so you can
>> reproduce the experiments.
>>
>> Regards,
>>
>> Matthieu
>>
>> On Mon, Apr 1, 2013 at 10:42 PM, JiHyoun Park <
>> [email protected]> wrote:
>>
>> Hi
>>
>> I am testing the newest release of S4.
>> It's fantastic that the stream throughput of S4 0.6.0 has been improved
>> to handle 200K+ messages per sec!
>> However, it seems that the S4-25 branch - deploying S4 applications with
>> Yarn - is not included in the 0.6.0 package yet.
>> I already built a system to run S4 applications on Yarn and want to
>> migrate its S4 framework from 0.5.0 to 0.6.0.
>> How can I use the 'deploying S4 applications with Yarn' feature on S4
>> 0.6.0?
>>
>> Best Regards
>> Jihyoun
>>
>>
>>
>>
>> --
>> Jeryl Cook
>> Founder & Chief Executive Officer
>> VanitySoft, Inc.
>> A Geo Business Intelligence Technology Consulting Firm
>> www.vanity-soft.com
>> www.linkedin.com/in/jerylcook
>> Get answers to "who knew what, when, and where"... and everything in
>> between.
>>
>> ____________________________________________________
>> This message contains information which may be confidential and
>> privileged. Unless you are the addressee (or authorized to receive for the
>> addressee), you may not use, copy or disclose to anyone the message or any
>> information contained in the message. If you have received the message in
>> error, please advise the sender by reply e-mail
>> [email protected], and delete the message.
>>
>>
>>
>
>