Re: Can't get all stored values via range iterator

2015-11-18 Thread Yi Pan
Hi, Alexander, Very glad that you figured it out! Thanks! -Yi On Tue, Nov 17, 2015 at 1:41 PM, Alexander Filipchik wrote: > Just want to update you on this one. After some time spent in debugging I > found that the actual problem was a piece of our code that was calling > next() on a range ite

Re: Sporadic errors in JobRunner

2015-11-18 Thread Yi Pan
Hi, Rick, I think that you are running into SAMZA-754. I have a RB available for it already. I will upload the patch and it would be good if you can try the patch to see whether that solves your problem. -Yi On Tue, Nov 17, 2015 at 12:01 PM, Rick Mangi wrote: > Hi, getting things working on sa

Re: Review Request 40313: SAMZA-785

2015-11-18 Thread VENKATA KRISHNA NANNAPANENI
> On Nov. 16, 2015, 7:34 p.m., Boris Shkolnik wrote: > > Is there anything I need to do like commit or create pull request? I wasn't aware of the process from here. - VENKATA KRISHNA --- This is an automatically generated e-mail. To re

Re: Review Request 40313: SAMZA-785

2015-11-18 Thread Navina Ramesh
> On Nov. 16, 2015, 7:34 p.m., Boris Shkolnik wrote: > > > > VENKATA KRISHNA NANNAPANENI wrote: > Is there anything I need to do like commit or create pull request? I > wasn't aware of the process from here. Sorry for the delay, Venkata Krishna Nannapaneni. We are currently closing up the

Re: Sporadic errors in JobRunner

2015-11-18 Thread Rick Mangi
I seem to have solved it by only specifying a single zookeeper node in my job config. Maybe a race condition of some sort? > On Nov 18, 2015, at 2:37 PM, Yi Pan wrote: > > Hi, Rick, > > I think that you are running into SAMZA-754. I have a RB available for it > already. I will upload the patc

Re: Sporadic errors in JobRunner

2015-11-18 Thread Rick Mangi
I take that back, it happened again. Will try your patch. > On Nov 18, 2015, at 3:36 PM, Rick Mangi wrote: > > I seem to have solved it by only specifying a single zookeeper node in my job > config. Maybe a race condition of some sort? > > >> On Nov 18, 2015, at 2:37 PM, Yi Pan wrote: >> >

Re: Sporadic errors in JobRunner

2015-11-18 Thread Rick Mangi
Sorry. Just read the bug. Yes, that makes sense. I deleted a bunch of topics and then hit this. > On Nov 18, 2015, at 3:42 PM, Rick Mangi wrote: > > I take that back, it happened again. Will try your patch. > > >> On Nov 18, 2015, at 3:36 PM, Rick Mangi wrote: >> >> I seem to have solved i

Re: Sporadic errors in JobRunner

2015-11-18 Thread Rick Mangi
That patch seems to have fixed the problem. > On Nov 18, 2015, at 3:43 PM, Rick Mangi wrote: > > Sorry. Just read the bug. Yes, that makes sense. I deleted a bunch of topics > and then hit this. > > >> On Nov 18, 2015, at 3:42 PM, Rick Mangi wrote: >> >> I take that back, it happened again

Re: Review Request 40421: SAMZA-754: fix the null oldestOffset issue in empty topics

2015-11-18 Thread Navina Ramesh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/40421/#review107088 --- Ship it! One question. Otherwise, lgtm! :) samza-kafka/src/main/

Re: Review Request 40313: SAMZA-785

2015-11-18 Thread Navina Ramesh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/40313/#review107091 --- samza-log4j/src/main/java/org/apache/samza/logging/log4j/StreamApp

Re: Review Request 40407: Added gauge metric to measure state store restoration time

2015-11-18 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/40407/#review107095 --- Overall looks good. It would be nice to add some unit test as well.

Re: Review Request 40407: Added gauge metric to measure state store restoration time

2015-11-18 Thread Navina Ramesh
> On Nov. 18, 2015, 10:01 p.m., Yi Pan (Data Infrastructure) wrote: > > samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java, > > line 79 > > > > > > If this is the only place to call t

Re: Review Request 40457: SAMZA-788 - coordinator stream configuration should not guess the system names

2015-11-18 Thread Navina Ramesh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/40457/ --- (Updated Nov. 18, 2015, 10:13 p.m.) Review request for samza, Boris Shkolnik, Y

Review Request 40457: SAMZA-788 - coordinator stream configuration should not guess the system names

2015-11-18 Thread Navina Ramesh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/40457/ --- Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jagadish Ve

Avro vs Protocol buffer for Samza output

2015-11-18 Thread Selina Tech
Dear All: I need to generate some data with Samza, send it to Kafka, and then write it to a Parquet format file. I was asked why I chose Avro as my Samza output format to Kafka instead of Protocol Buffers, since our data on Kafka is currently all Protocol Buffers. I explained for Avro encoded message -

Re: Review Request 40407: Added gauge metric to measure state store restoration time

2015-11-18 Thread Navina Ramesh
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/40407/ --- (Updated Nov. 19, 2015, 1:21 a.m.) Review request for samza, Yan Fang, Jake Mae

Re: Avro vs Protocol buffer for Samza output

2015-11-18 Thread Yi Pan
Hi, Selina, Samza's producer/consumer is highly tunable. You can configure it to use ProtocolBufferSerde class if your messages in Kafka are in ProtocolBuf format. The use of Avro in Kafka is LinkedIn's choice and does not necessarily fit others. For the sake of "why LinkedIn uses Avro", here is

Re: Avro vs Protocol buffer for Samza output

2015-11-18 Thread Selina Tech
Hi, Yi: Thanks for your reply. Do you mean there is no advantage to Avro messages over Protocol Buffer messages on Kafka except the Avro schema registry? BTW, do you know how Kafka implements the Avro message? Does each Avro message include the schema or not? The size of Avro message is a big c

Re: Avro vs Protocol buffer for Samza output

2015-11-18 Thread Selina Tech
Hi, Yi: I think I got the answer as below: "The Kafka message format starts with a magic byte indicating what kind of serialization is used for this message. And if this byte indicates Avro, you can layout your message as starting with the schemaId and then followed by message payload. Upon c

Re: Avro vs Protocol buffer for Samza output

2015-11-18 Thread Yi Pan
Hi, Selina, On Wed, Nov 18, 2015 at 5:43 PM, Selina Tech wrote: > Hi, Yi: > Thanks for your reply. Do you mean there is no advantage of Avro > message vs Protocol buffer message on Kafka except Avro schema registry? > > Well, be careful about interpreting my words in this way. I did not

Re: Avro vs Protocol buffer for Samza output

2015-11-18 Thread Yi Pan
Yeah, this reduced-overhead message format requires an Avro schema registry so that you can look up the actual Avro schema via the schemaId. On Wed, Nov 18, 2015 at 5:53 PM, Selina Tech wrote: > Hi, Yi: > > I think I got the answer as below: > > "The Kafka message format starts
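
The layout discussed in this thread (a magic byte, then a schemaId, then the message payload, with the schema itself looked up from a registry) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the format used by any particular Kafka or Samza deployment: the magic value, the 4-byte big-endian width of the schemaId, the in-memory dict standing in for a schema registry, and all function names are hypothetical.

```python
import struct

# Assumption: one magic byte marks the payload as Avro. The value 0 is made up.
MAGIC_AVRO = 0

# Assumption: the schemaId is a 4-byte unsigned big-endian integer.
HEADER = struct.Struct(">BI")

# Hypothetical stand-in for a schema registry: schemaId -> Avro schema (JSON text).
SCHEMA_REGISTRY = {
    42: '{"type": "record", "name": "Event", "fields": [{"name": "id", "type": "long"}]}',
}


def frame_message(schema_id: int, payload: bytes) -> bytes:
    """Prefix an already Avro-encoded payload with the magic byte and schemaId."""
    return HEADER.pack(MAGIC_AVRO, schema_id) + payload


def unframe_message(message: bytes) -> tuple[int, bytes]:
    """Split a framed message back into (schemaId, payload)."""
    magic, schema_id = HEADER.unpack_from(message, 0)
    if magic != MAGIC_AVRO:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER.size:]


def lookup_schema(schema_id: int) -> str:
    """Resolve the writer schema for a schemaId from the registry stand-in."""
    return SCHEMA_REGISTRY[schema_id]


framed = frame_message(42, b"\x02")          # payload bytes are opaque here
schema_id, payload = unframe_message(framed)
writer_schema = lookup_schema(schema_id)     # consumer decodes payload with this
```

With this kind of framing each message carries only a small fixed header rather than the full schema, which is why the consumer needs a registry to resolve the schemaId before decoding.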

Re: question on yarn.container.cpu.cores

2015-11-18 Thread Chen Song
Thanks, Navina. So theoretically I can create a thread pool within a container. I know it is very hacky, but it should increase parallelism. Chen On Mon, Nov 16, 2015 at 5:49 PM, Navina Ramesh wrote: > Hi Chen, > Samza container is still single threaded. In case of yarn based deployment, > Samza

Re: Review Request 40407: Added gauge metric to measure state store restoration time

2015-11-18 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/40407/#review107143 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Nov. 19, 20

Re: Review Request 40421: SAMZA-754: fix the null oldestOffset issue in empty topics

2015-11-18 Thread Yi Pan (Data Infrastructure)
> On Nov. 18, 2015, 9:24 p.m., Navina Ramesh wrote: > > samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemAdmin.scala, > > line 199 > > > > > > Should we do a similar check for newestOffsets? So f

Review Request 40472: SAMZA-816: avoid creating and registering CoordinatorSystemConsumer in LocalityManager in SamzaContainer

2015-11-18 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/40472/ --- Review request for samza, Yan Fang, Chinmay Soman, Chris Riccomini, and Navina R