Re: How to do aggregation in Samza?

2015-10-21 Thread jeremy p
libraries? > > Sincerely, > Selina > > On Wed, Oct 21, 2015 at 8:15 AM, jeremy p > wrote: > > > Hey all, > > > > So, I'm wanting to do aggregate operations in Samza. Counts, averages, > > grouping, things of that nature. Basically, the kinds of

How to do aggregation in Samza?

2015-10-21 Thread jeremy p
Hey all, So, I'm wanting to do aggregate operations in Samza. Counts, averages, grouping, things of that nature. Basically, the kinds of aggregate operations you can do in SQL. What's the best way to do this in Samza? Are there any libraries for this? I noticed a few projects are currently in
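One way to picture the aggregates asked about here (counts, averages, grouping) is a running aggregate keyed by group and updated once per message. The sketch below is plain Python rather than Samza API. The `RunningAggregate` class and the `(key, value)` message shape are hypothetical. State is held in memory for illustration only; a real streaming job would need it to survive restarts.

```python
class RunningAggregate:
    """Running count and sum per group key (roughly GROUP BY with COUNT and AVG)."""

    def __init__(self):
        self.counts = {}   # key -> number of messages seen
        self.totals = {}   # key -> sum of values seen

    def update(self, key, value):
        # Called once per incoming message.
        self.counts[key] = self.counts.get(key, 0) + 1
        self.totals[key] = self.totals.get(key, 0.0) + value

    def count(self, key):
        return self.counts.get(key, 0)

    def average(self, key):
        # None signals that no message has been seen for this key yet.
        n = self.counts.get(key, 0)
        return self.totals[key] / n if n else None


agg = RunningAggregate()
for key, value in [("a", 2), ("b", 10), ("a", 4)]:
    agg.update(key, value)
```

Keeping a count and a sum, rather than a stored average, means each message updates the state in constant time and the average can be derived on demand.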

Re: How to pass arguments to a Samza job

2015-04-21 Thread jeremy p
I just > > tested it out with hello-samza and it does work. > > > > deploy/samza/bin/run-job.sh > > > --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory > > --config-path=file://$PWD/deploy/samza/config/wikipedia-feed.properties >

Re: How to pass arguments to a Samza job

2015-04-21 Thread jeremy p
this class). You can pass commandline arguments with the > flag "--config key=value" in your run-job script. > > > On Tue, Apr 21, 2015 at 11:29 AM, jeremy p > > wrote: > > > Hello all, > > > > Is there a way to pass arguments to a Samza job from the com

How to pass arguments to a Samza job

2015-04-21 Thread jeremy p
Hello all, Is there a way to pass arguments to a Samza job from the command line? If not, is there any way to pass arguments to a Samza job besides the .properties file? Also, is there a way to pass application-specific properties using the .properties file? (such as my.random.config.value=foo)
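A reply in this thread mentions a `--config key=value` flag on the run-job script. The general idea of layering command-line overrides on top of a .properties file can be sketched in plain Python as below. `parse_properties` and `apply_overrides` are hypothetical helpers, not part of Samza; `my.other.value` is a made-up key; and the precedence shown (the override wins over the file) is an assumption of this sketch.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def apply_overrides(props, overrides):
    """Return a copy of props with each 'key=value' override applied on top."""
    merged = dict(props)
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        merged[key.strip()] = value.strip()
    return merged


# Application-specific key taken from the question above.
file_text = "# job settings\nmy.random.config.value=foo\n"
config = apply_overrides(
    parse_properties(file_text),
    ["my.random.config.value=bar", "my.other.value=1"],
)
```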

Re: How to deal with bootstrapping

2015-04-17 Thread jeremy p
es. > > Thanks, > > Fang, Yan > yanfang...@gmail.com > > On Thu, Apr 16, 2015 at 2:16 PM, Benjamin Black wrote: > > > New-Rules-Job will need to know the complete map of partitions to > offsets. > > > > On Thu, Apr 16, 2015 at 2:06 PM, jeremy p < > ath

Re: How to deal with bootstrapping

2015-04-16 Thread jeremy p
learn how to do this? Thank you! On Thu, Apr 16, 2015 at 5:06 PM, jeremy p wrote: > Ben : I think we are talking about different things here. I'm not trying > to maintain ordering across a topic. I know that is not what Kafka and > Samza are meant for. What I'm trying to

Re: How to deal with bootstrapping

2015-04-16 Thread jeremy p
em. To do otherwise produces bad times. > > On Thu, Apr 16, 2015 at 1:51 PM, jeremy p > wrote: > > > Thank you for the response. Does this mean the Old-Rules-Job would need > to > > maintain a Last-Processed-Old-Rules offset for each partition? > > > > On Thu,

Re: How to deal with bootstrapping

2015-04-16 Thread jeremy p
rs and consumers. > > On Thu, Apr 16, 2015 at 1:01 PM, jeremy p > wrote: > > > Thanks to everybody for the responses! > > > > Yi : The queue must be processed in order, which means that I cannot use > > Ben and Guozhang's approach. > > > > Howe

Re: How to deal with bootstrapping

2015-04-16 Thread jeremy p
> > Then it sends signal to the job1 and shutdown job1, and applies all > > rules > > > > to the stream. > > > > > > > > In terms of how to shutdown the job1, here is one solution > > > > < > > > > > > > > > >

Re: How to deal with bootstrapping

2015-04-15 Thread jeremy p
If you can not tolerate the shortcoming, 1) get the offset of the > latest-processed message of old rules. 2) In your new task, ignore messages > before that offset for the old rules. 3) bootstrap. > > Hope this helps. Maybe your use case is more complicated? > > Thanks, > >
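The three steps quoted above can be sketched as a per-partition check, in plain Python rather than Samza API. The function name, the rule names, and the offsets are all hypothetical. The sketch assumes one reading of step 2: the old rules skip any message at or before the recorded offset, while the new rules run on every message during the bootstrap replay.

```python
def rules_to_apply(partition, offset, last_old_offsets, old_rules, new_rules):
    """Pick which rules run for the message at (partition, offset).

    last_old_offsets maps partition -> offset of the last message the old
    rules already processed (step 1). Partitions missing from the map are
    treated as never processed.
    """
    last = last_old_offsets.get(partition, -1)
    selected = list(new_rules)           # new rules see every replayed message
    if offset > last:                    # step 2: old rules skip what they already saw
        selected = list(old_rules) + selected
    return selected


# Step 1: offsets recorded per partition before the new job starts (made-up values).
last_old_offsets = {0: 41, 1: 17}

# Step 3: replaying from the beginning, each message is checked against the map.
replayed = rules_to_apply(0, 10, last_old_offsets, ["even_odd"], ["is_prime"])
fresh = rules_to_apply(0, 42, last_old_offsets, ["even_odd"], ["is_prime"])
```

The offsets are tracked per partition because an offset only identifies a position within a single partition, which matches the point raised elsewhere in this thread about needing the complete map of partitions to offsets.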

Re: Maximum number of jobs

2015-04-15 Thread jeremy p
handle those connections. > > Can you describe your use case in more detail? Running 1 million jobs seems > like it might be a mis-use of this technology. > > Cheers, > Chris > > On Wed, Apr 15, 2015 at 10:24 AM, jeremy p > > wrote: > > > What's the maxim

How to deal with bootstrapping

2015-04-15 Thread jeremy p
So, I'm wanting to use Samza for a project I'm working on, but I keep running into a problem with bootstrapping. Let's say there's a Kafka topic called Numbers that I want to consume with Samza. Let's say each message has a single integer in it, and I want to classify it as even or odd. So I hav

Maximum number of jobs

2015-04-15 Thread jeremy p
What's the maximum number of Samza jobs I can run simultaneously on a single cluster? Let's say these jobs are very lightweight -- they require little memory or processing power. However, I need a lot of them -- let's say I need to have 1,000,000 running at any given time. Is this reasonable or

Re: New user here

2015-04-08 Thread jeremy p
obably > open an INFRA ticket to create a user list. > > -jakob > > On 8 April 2015 at 12:20, jeremy p wrote: > > Hello all, > > > > I am new to Samza. Super excited about the project, evaluating it to see > > if it's right for a project I'm working o

New user here

2015-04-08 Thread jeremy p
Hello all, I am new to Samza. Super excited about the project, evaluating it to see if it's right for a project I'm working on. I didn't see a u...@samza.apache.org mailing list. Is it okay to post questions about using Samza to this mailing list? --Jeremy