Review Request 46296: SAMZA-932: JMX port collisions in JmxServer

2016-04-15 Thread Tao Feng

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46296/
---

Review request for samza.


Repository: samza


Description
---

SAMZA-932: JMX port collisions in JmxServer


Diffs
-

  samza-core/src/main/scala/org/apache/samza/metrics/JmxServer.scala 
e6204c10878589d34096378e6000709266a9b4a5 

Diff: https://reviews.apache.org/r/46296/diff/


Testing
---

./gradlew clean build && ./gradlew checkstyleMain checkstyleTest


Thanks,

Tao Feng



Re: Review Request 46287: Add a double serde.

2016-04-15 Thread Jake Maes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46287/#review129210
---


Ship it!

- Jake Maes


On April 15, 2016, 11:17 p.m., Jon Bringhurst wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/46287/
> ---
> 
> (Updated April 15, 2016, 11:17 p.m.)
> 
> 
> Review request for samza.
> 
> 
> Bugs: SAMZA-936
> https://issues.apache.org/jira/browse/SAMZA-936
> 
> 
> Repository: samza
> 
> 
> Description
> ---
> 
> Add a simple double serde.
> 
> 
> Diffs
> -
> 
>   samza-core/src/main/scala/org/apache/samza/serializers/DoubleSerde.scala 
> PRE-CREATION 
>   samza-core/src/test/scala/org/apache/samza/serializers/TestDoubleSerde.scala 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/46287/diff/
> 
> 
> Testing
> ---
> 
> A simple unit test was added.
> 
> 
> Thanks,
> 
> Jon Bringhurst
> 
>



Review Request 46287: Add a double serde.

2016-04-15 Thread Jon Bringhurst

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46287/
---

Review request for samza.


Bugs: SAMZA-936
https://issues.apache.org/jira/browse/SAMZA-936


Repository: samza


Description
---

Add a simple double serde.


Diffs
-

  samza-core/src/main/scala/org/apache/samza/serializers/DoubleSerde.scala 
PRE-CREATION 
  samza-core/src/test/scala/org/apache/samza/serializers/TestDoubleSerde.scala 
PRE-CREATION 

Diff: https://reviews.apache.org/r/46287/diff/


Testing
---

A simple unit test was added.


Thanks,

Jon Bringhurst



Re: Review Request 46282: SAMZA-928 document Kerberos on YARN

2016-04-15 Thread Chen Song

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46282/
---

(Updated April 15, 2016, 10:09 p.m.)


Review request for samza.


Repository: samza


Description
---

SAMZA-928 document Kerberos on YARN


Diffs (updated)
-

  docs/learn/documentation/versioned/jobs/yarn-jobs.md 827cc14 
  docs/learn/documentation/versioned/yarn/isolation.md 1eb3bf5 
  docs/learn/documentation/versioned/yarn/yarn-security.md PRE-CREATION 

Diff: https://reviews.apache.org/r/46282/diff/


Testing
---


Thanks,

Chen Song



Review Request 46282: SAMZA-928 document Kerberos on YARN

2016-04-15 Thread Chen Song

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46282/
---

Review request for samza.


Repository: samza


Description
---

SAMZA-928 document Kerberos on YARN


Diffs
-

  docs/learn/documentation/versioned/jobs/yarn-jobs.md 827cc14 
  docs/learn/documentation/versioned/yarn/isolation.md 1eb3bf5 
  docs/learn/documentation/versioned/yarn/yarn-security.md PRE-CREATION 

Diff: https://reviews.apache.org/r/46282/diff/


Testing
---


Thanks,

Chen Song



Re: Exactly once processing

2016-04-15 Thread Robert Crim
Looking at:
https://github.com/apache/samza/blob/f02386464d31b5a496bb0578838f51a0331bfffa/samza-core/src/main/scala/org/apache/samza/container/TaskInstance.scala#L171


The commit function, in order:
1. Flushes metrics
2. Flushes stores
3. Produces messages from the collectors
4. Writes offsets

So I would reason that it would be OK to store an offset you've seen in the
store and use that to skip messages if you've already mutated your data.
Be aware, though, that any of steps 2 (if there are multiple stores), 3, or 4
may not have happened, so you might want to do those again. You'd need to be
careful if your changes span multiple stores or keys, since multiple writes
to changelogs are not atomic.
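
A rough sketch of the skip-by-offset idea, in Python against a plain dict
rather than the Samza store API. The names `AggregateTask` and `process`
and the `(sum, last_offset)` value layout are hypothetical, and the sketch
assumes that all messages for a given key come from a single partition, so
their offsets are comparable. The point is that the aggregate and the offset
are written together in one value, so for a single key they cannot diverge:

```python
# Hypothetical in-memory stand-in for a key-value store; not the Samza API.
class AggregateTask:
    def __init__(self, store=None):
        # store maps key -> (running_sum, last_applied_offset)
        self.store = {} if store is None else store

    def process(self, key, value, offset):
        """Add value to the aggregate for key unless this offset was already applied."""
        total, last_offset = self.store.get(key, (0, -1))
        if offset <= last_offset:
            return False  # replayed message: the state already reflects it
        # Aggregate and offset go into ONE write, so they stay consistent for this key.
        self.store[key] = (total + value, offset)
        return True


task = AggregateTask()
messages = [("a", 5, 0), ("a", 7, 1), ("a", 3, 2)]
for key, value, offset in messages:
    task.process(key, value, offset)

# Simulated recovery: the checkpoint lagged behind the store,
# so the messages at offsets 1 and 2 are delivered again.
replayed = [task.process(k, v, o) for k, v, o in messages[1:]]
```

This only covers a single key in a single store. An update that touches
several keys or stores would still need extra care, for the reason given
above.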

Question to maintainers: is it safe for Samza users to rely on this order?

On Fri, Apr 15, 2016 at 11:31 AM, Sabarish Sasidharan <
sabarish@gmail.com> wrote:

> Hi Guozhang
>
> Thanks. Assuming the checkpoint would typically be behind the offset
> persisted in my store (+ changelog), when the messages are replayed
> starting from the checkpoint, I can very well skip those by comparing
> against the offset in my store right? So I am not understanding why
> duplicates would affect my state.
>
> Regards
> Sab
>
> On Fri, Apr 15, 2016 at 10:07 PM, Guozhang Wang wrote:
>
> > Hi Sab,
> >
> > For stateful processing where you have persistent state stores, you need to
> > maintain the checkpoint which includes the committed offsets as well as the
> > store flushed in sync, but right now these two operations are not done
> > atomically, and hence if you fail in between, you could still get
> > duplicates where you consume from the committed offsets while some of them
> > have already updated the stores.
> >
> > Guozhang
> >
> >
> > On Thu, Apr 14, 2016 at 11:56 PM, Sasidharan, Sabarish <
> > sabarish.sasidha...@harman.com> wrote:
> >
> > > Hi
> > >
> > > To achieve exactly once processing for my aggregates, wouldn’t it be
> > > enough if I maintain the latest offset processed for the aggregate and
> > > check against that offset when messages are replayed on recovery? Am I
> > > missing something here?
> > >
> > > Thanks
> > >
> > > Regards
> > > Sab
> >
> >
> >
> >
> > --
> > -- Guozhang
> >
>


Re: Exactly once processing

2016-04-15 Thread Guozhang Wang
Hi Sab,

For stateful processing where you have persistent state stores, you need to
maintain the checkpoint which includes the committed offsets as well as the
store flushed in sync, but right now these two operations are not done
atomically, and hence if you fail in between, you could still get
duplicates where you consume from the committed offsets while some of them
have already updated the stores.
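
To make the failure window concrete, here is a small Python simulation. It
is only a sketch: `process_batch`, the dict-based store, and the
`fail_before_checkpoint` flag are hypothetical stand-ins, not Samza code. The
store update and the checkpoint write are two separate steps, and a crash
between them causes the same messages to be counted twice after restart:

```python
def process_batch(messages, store, checkpoint, fail_before_checkpoint=False):
    """Count every message after the checkpoint, then write the new checkpoint.

    messages is a list of (offset, key) pairs. Returns the checkpoint that
    survives: unchanged if the simulated crash happens before it is written.
    """
    for offset, key in messages:
        if offset > checkpoint:
            # Step 1: the state change reaches the store.
            store[key] = store.get(key, 0) + 1
    if fail_before_checkpoint:
        # Crash between the two steps: store updated, offsets never written.
        return checkpoint
    # Step 2: the committed offset is written.
    return max([checkpoint] + [offset for offset, _ in messages])


messages = [(0, "x"), (1, "x"), (2, "x")]
store = {}
checkpoint = -1

# First attempt crashes after the store update but before the checkpoint.
checkpoint = process_batch(messages, store, checkpoint, fail_before_checkpoint=True)
# After restart, consumption resumes from the old checkpoint and replays everything.
checkpoint = process_batch(messages, store, checkpoint)
```

After both runs the count for "x" is 6 rather than 3, even though the final
checkpoint is correct.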

Guozhang


On Thu, Apr 14, 2016 at 11:56 PM, Sasidharan, Sabarish <
sabarish.sasidha...@harman.com> wrote:

> Hi
>
> To achieve exactly once processing for my aggregates, wouldn’t it be
> enough if I maintain the latest offset processed for the aggregate and
> check against that offset when messages are replayed on recovery? Am I
> missing something here?
>
> Thanks
>
> Regards
> Sab




-- 
-- Guozhang