[VOTE] Apache Samza 0.9.1 RC1

2015-06-28 Thread Yi Pan
Hey all,

This is a call for a vote on a release of Apache Samza 0.9.1. This is a
bug-fix release against 0.9.0.

The release candidate can be downloaded from here:

http://people.apache.org/~nickpan47/samza-0.9.1-rc1/

The release candidate is signed with pgp key 911402D8, which is
included in the repository's KEYS file:

https://git-wip-us.apache.org/repos/asf?p=samza.git;a=blob_plain;f=KEYS;hb=6f5bafb6cd93934781161eb6b1868d11ea347c95

and can also be found on keyservers:

http://pgp.mit.edu/pks/lookup?op=get&search=0x911402D8

The git tag is release-0.9.1-rc1 and signed with the same pgp key:

https://git-wip-us.apache.org/repos/asf?p=samza.git;a=tag;h=e78b9e7f34650538b4bb68b338eb472b98a5709e

Test binaries have been published to Maven's staging repository, and are
available here:
https://repository.apache.org/content/repositories/orgapachesamza-1007/

Note that release 0.9.1 still supports JDK6, and the binaries were built
with JDK6 without incident.

6 critical bugs were resolved for this release:

https://issues.apache.org/jira/browse/SAMZA-715?jql=project%20%3D%20SAMZA%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20%28Resolved%2C%20Closed%29

The vote will be open for 72 hours (ending at 12:00pm Wed, 07/01/2015).
Please download the release candidate, check the hashes/signature, build it
and test it, and then please vote:

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)


Re: Samza and sliding window

2015-06-28 Thread Shekar Tippur
Milinda,

I see that the document you mentioned addresses windowing but I also need
to group by different applications.

Application  Count
-----------  -----
A            100
B            40
C            69
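
As a rough illustration of the per-application grouping shown in the table above, the sketch below keeps one queue of event timestamps per application and reports how many fall inside the most recent window. It is plain Java (8 or later) with no Samza dependency. The class name `PerApplicationSlidingCount` and its methods are hypothetical, and it assumes the events for each application arrive in timestamp order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of Samza: exact sliding-window counts per application. */
class PerApplicationSlidingCount {
    private final long windowMs;
    // application name -> timestamps of its events, oldest first
    private final Map<String, Deque<Long>> timestampsByApp = new HashMap<>();

    PerApplicationSlidingCount(long windowMs) {
        if (windowMs <= 0) {
            throw new IllegalArgumentException("windowMs must be positive");
        }
        this.windowMs = windowMs;
    }

    /** Records one event for the given application. */
    void record(String application, long eventTimeMs) {
        timestampsByApp
            .computeIfAbsent(application, k -> new ArrayDeque<>())
            .addLast(eventTimeMs);
    }

    /** Returns the per-application counts for the window (nowMs - windowMs, nowMs]. */
    Map<String, Integer> counts(long nowMs) {
        long cutoff = nowMs - windowMs;
        Map<String, Integer> result = new TreeMap<>();
        Iterator<Map.Entry<String, Deque<Long>>> it = timestampsByApp.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Deque<Long>> entry = it.next();
            Deque<Long> timestamps = entry.getValue();
            // Evict events that have slid out of the window.
            while (!timestamps.isEmpty() && timestamps.peekFirst() <= cutoff) {
                timestamps.pollFirst();
            }
            if (timestamps.isEmpty()) {
                it.remove(); // no recent events: forget this application
            } else {
                result.put(entry.getKey(), timestamps.size());
            }
        }
        return result;
    }
}
```

With a 5 minute window (`new PerApplicationSlidingCount(300_000L)`), `counts(now)` returns a sorted map from application name to count, which could be rendered as the table above. Memory grows with the number of events in the window, so this only suits modest event rates.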


- Shekar

On Fri, Jun 26, 2015 at 11:39 AM, Shekar Tippur ctip...@gmail.com wrote:

 Never mind. I see it here:

 http://samza.apache.org/learn/documentation/0.8/container/windowing.html

 Thanks again Milinda.

 - Shekar

 On Fri, Jun 26, 2015 at 11:39 AM, Shekar Tippur ctip...@gmail.com wrote:

 Thanks Milinda.
 Is this feature available on 0.8 version of Samza?

 - Shekar

 On Fri, Jun 26, 2015 at 11:24 AM, Milinda Pathirage 
 mpath...@umail.iu.edu wrote:

 Hi Shekar,

 You can use Samza's local storage
 (http://samza.apache.org/learn/documentation/0.9/container/state-management.html)
 to keep the window state, and its windowing capabilities
 (http://samza.apache.org/learn/documentation/0.9/container/windowing.html)
 to handle the window advancement. During advancement you can update the
 local cache (Redis in your case). AFAIK, Samza doesn't provide any helpers
 or utilities to handle window state maintenance. You have to implement it
 on top of local storage or, if you don't want fault tolerance, you can keep
 the state in memory too (as long as the state fits in memory).
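
 The approach described above (counts kept in a store, expired on each window advancement) might look roughly like the following. This is a sketch in plain Java 8, not Samza code: `BucketedWindowCounter`, `process`, and `window` are hypothetical names that only mirror the shape of a per-message callback and a periodic window callback, and the `TreeMap` stands in for a local key-value store. Counts are kept per fixed-size bucket, so the result is approximate at bucket granularity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of Samza: bucketed sliding-window counts. */
class BucketedWindowCounter {
    private final long windowMs;
    private final long bucketMs;
    // Stand-in for a local key-value store:
    // bucket start time -> (application name -> event count in that bucket)
    private final TreeMap<Long, Map<String, Long>> buckets = new TreeMap<>();

    BucketedWindowCounter(long windowMs, long bucketMs) {
        if (bucketMs <= 0 || windowMs < bucketMs) {
            throw new IllegalArgumentException("need 0 < bucketMs <= windowMs");
        }
        this.windowMs = windowMs;
        this.bucketMs = bucketMs;
    }

    /** Per-message step: increment the count for the event's bucket. */
    void process(String application, long eventTimeMs) {
        long bucketStart = eventTimeMs - Math.floorMod(eventTimeMs, bucketMs);
        buckets.computeIfAbsent(bucketStart, k -> new HashMap<>())
               .merge(application, 1L, Long::sum);
    }

    /**
     * Window advancement step: drop buckets that ended at or before the start
     * of the window, then sum the remaining buckets per application.
     * A bucket that only partly overlaps the window is counted in full.
     */
    Map<String, Long> window(long nowMs) {
        long lastExpiredBucketStart = nowMs - windowMs - bucketMs;
        buckets.headMap(lastExpiredBucketStart, true).clear();

        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> perApp : buckets.values()) {
            perApp.forEach((app, n) -> totals.merge(app, n, Long::sum));
        }
        return totals;
    }
}
```

 The totals returned by `window` are what would be written to the external cache or to an output topic on each advancement. A smaller bucket size gives a more accurate window at the cost of more stored entries.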

 Thanks
 Milinda

 On Fri, Jun 26, 2015 at 1:53 PM, Shekar Tippur ctip...@gmail.com
 wrote:

  Yan,
 
 
  *What do you mean by a local cache? Is it a db like MySQL, something
  like RocksDB, or even just in-memory?*

  Local cache as in Redis.

  *When you say another topic, is this the topic consumed by the same
  Samza job as your 5-minutes-job, or in a separate job? What is the
  relation between the topic and the application name?*

  We don't have a 5 min job. All we have now is a stream of events coming
  from a bunch of applications. All these land on a raw Kafka topic. The
  stream data has the application name. I want to create a job that takes
  the incoming stream, groups it by application name, and counts the number
  of events we get in a 5 min sliding window.
 
  - Shekar
 
  On Fri, Jun 26, 2015 at 10:29 AM, Yan Fang yanfang...@gmail.com
 wrote:
 
   Hi Shekar,
  
   Need a little more clarification.
  
   What do you mean by a local cache? Is it a db like MySQL, something like
   RocksDB, or even just in-memory?

   When you say another topic, is this the topic consumed by the same Samza
   job as your 5-minutes-job, or in a separate job? What is the relation
   between the topic and the application name?
  
   Thanks,
  
   Fang, Yan
   yanfang...@gmail.com
  
   On Fri, Jun 26, 2015 at 1:08 AM, Shekar Tippur ctip...@gmail.com
  wrote:
  
Hello,
My apologies if I have raised it earlier.
Here is the use case:
I have a stream that is partitioned based on application name. I want to be
able to count the number of events happening for that particular
application in the past 5 minutes (sliding window) and update either
another topic or a local cache.
   
Is this possible via 0.9 version of Samza?
If not, what is the easiest way to achieve this?
   
- Shekar
   
  
 



 --
 Milinda Pathirage

 PhD Student | Research Assistant
 School of Informatics and Computing | Data to Insight Center
 Indiana University

 twitter: milindalakmal
 skype: milinda.pathirage
 blog: http://milinda.pathirage.org