Re: Kafka dependency

2016-05-10 Thread Yi Pan
Hi, Nick, We do have plan to update the Kafka dependency in Samza. However, Samza only uses Kafka client library. We have confirmed that any Kafka 0.8.2 clients should be supported by Kafka 0.9 brokers. Hence, it should not block you if you are thinking of upgrading Kafka broker versions (e.g. Lin

Re: Kafka dependency

2016-05-10 Thread Yi Pan
ficial 0.10.0 release. > > Thanks! > Nick > > > -----Original Message- > From: Yi Pan [mailto:nickpa...@gmail.com] > Sent: Tuesday, May 10, 2016 11:22 AM > To: dev@samza.apache.org > Subject: Re: Kafka dependency > > Hi, Nick, > > We do have plan to u

Re: help me about samza metrics

2016-05-11 Thread Yi Pan
Hi, Shaodong, Could you clarify what's your specific question? I can see that you configured your job to use snapshot metrics reporter and you are getting the JSON format snapshot report as expected. What's wrong here? On Tue, May 10, 2016 at 1:25 AM, SSHOU wrote: > Sorry, forget Task config fi

Re: samza job start takes 20 minutes to figure out the Checkpointed offset

2016-05-12 Thread Yi Pan
;m really looking forward to samza support for kafka 0.9.0, I saw some > discussion about this topic in the email list, I guess I have to wait for a > while. > > > > On 10 May 2016 at 01:24, Yi Pan wrote: > > > Hi, Bo, > > > > I embedded my answers in-between: &

Re: 0.10.1 Release

2016-05-12 Thread Yi Pan
Hi, Andy, We are doing some pre-release work at this moment. My rough estimation on 0.10.1 timeline would be about 1 month away. Thanks a lot! -Yi On Thu, May 12, 2016 at 9:56 AM, Andy Throgmorton wrote: > Hi, > > I'm wondering if anyone has a rough estimate on when 0.10.1 will be > released?

Re: Java 8, RocksDB and Samza 0.10.0

2016-05-19 Thread Yi Pan
Hi, all, Samza 0.10 was test and validated using Java 8 and Redhat 6.6 in LinkedIn. The rocksDB native library issue was not seen in our runtime environment. We did see a unit test failure and was able to track back to the issue reported to RocksDB here: https://github.com/facebook/rocksdb/issues/

Re: Samza job killed by left orphaned on YARN

2016-05-19 Thread Yi Pan
Hi, David and all, The "ultimate" solution is probably to implement SAMZA-871 , which allows Samza JobCoordinator directly identifies whether a container is alive or not w/o dependency on the cluster management systems. This is also considered toget

Re: Java 8, RocksDB and Samza 0.10.0

2016-05-23 Thread Yi Pan
Awesome! @Louis, do you mind to contribute a section to FAQ for this issue? That would help all users who encounter this issue later. Thanks! -Yi On Fri, May 20, 2016 at 10:09 AM, Louis Calisi wrote: > We finally found the issue. The two links in my original email lead me to > believe it was w

Re: Java 8, RocksDB and Samza 0.10.0

2016-05-23 Thread Yi Pan
t; I don’t have permission to edit it. > > > Louis > > > > On 5/23/16, 1:17 PM, "Yi Pan" wrote: > > >Awesome! @Louis, do you mind to contribute a section to FAQ for this > issue? > >That would help all users who encounter this issue later. > >

Merged Kafka/Samza meetup @LinkedIn

2016-05-27 Thread Yi Pan
Hi, all, In order to organize better quality talks @ more frequent cadence, we are merging the Kafka/Samza meetups @ LinkedIn. The new meetup is titled Stream Processing Meetup @ LinkedIn and the first one was just announced on 6/15. Please see the details at the link here: http://www.meetup.com/

Re: Update all values in RocksDB

2016-06-06 Thread Yi Pan
Hi, David, I would recommend to keep a separate table of closed sessions as a "queue", ordered by the time the session is closed. And in your window method, just create an iterator in the "queue" and only make progress toward the end of the "queue", and do a point deletion in the sessionStore, whi

Re: Update all values in RocksDB

2016-06-07 Thread Yi Pan
n through the entries can > be a slow process? > > Thanks, > David > > On Mon, Jun 6, 2016 at 2:34 PM, Yi Pan wrote: > > > Hi, David, > > > > I would recommend to keep a separate table of closed sessions as a > "queue", > > ordered by the time the s

Re: No updates to some of the store changelog partitions

2016-06-13 Thread Yi Pan
Hi, David, Did you check the log to see whether there is any log lines indicating the producer issues on the three partitions that you suspect? And could you also check whether you have auto-commit turned on? If your auto-commit is on and producer does not report any issue writing to the changelog

Re: Manually Commit Offsets?

2016-06-14 Thread Yi Pan
Hi, Jeremiah, Samza does support manual checkpointing. You can following the steps below: 1) turn off auto-commit by setting task.commit.ms=-1 2) in your code, call TaskContext.commit() whenever you are ready to checkpoint. We have applications in LinkedIn using this pattern to successfully imple

Re: Manually Commit Offsets?

2016-06-14 Thread Yi Pan
Sorry. Correction: > 2) in your code, call TaskContext.commit() whenever you are ready to > checkpoint. > > *TaskCoordinator.commit()* > > On Tue, Jun 14, 2016 at 10:16 AM, Jeremiah Adams < > jad...@helixeducation.com> wrote: > >> We need to send messages to a remote service. I need to impleme

Re: Manually Commit Offsets?

2016-06-15 Thread Yi Pan
t see how > to wire it into my StreamTask. > > > Jeremiah Adams > Software Engineer > www.helixeducation.com > Blog | Twitter | Facebook | LinkedIn > > > From: Yi Pan > Sent: Tuesday, June 14, 2016 2:28 PM > To: dev@samza.apache

Re: Manually Commit Offsets?

2016-06-16 Thread Yi Pan
x27;pause' the job or stop processing kafka from inside a > StreamTask.process() method? That would work for me too. > > > Jeremiah Adams > Software Engineer > www.helixeducation.com > Blog | Twitter | Facebook | LinkedIn > > ____________ >

Re: Bug in SequenceFileHdfsFileWriter

2016-06-16 Thread Yi Pan
Hi, Benjamin, Thanks a lot for reporting this! It makes sense from reading the posts. Could you open a JIRA? Are you interested in assigning to yourself and contribute the fix? Thanks a lot again! -Yi On Thu, Jun 16, 2016 at 9:52 AM, Benjamin Smith < ben.sm...@ranksoftwareinc.com> wrote: > > H

Re: A magic question

2016-06-29 Thread Yi Pan
Hi, Shaodong, Could you try to paste the graphs somewhere else? Apache mailing list seems to remove all the embedded images in your email. Hence, I can not see what your exact problem is. Thanks! On Wed, Jun 29, 2016 at 12:58 AM, 吴少东 wrote: > Hello everyone: > > When use *Metrics* > <

Re: The best way to import data into kv store?

2016-07-06 Thread Yi Pan
Hi, Sining, There are a few questions to be asked s.t. we know your application use case better. 1) In what format is your old userid-db data? 2) Is the old userid-db data partitioned using the same key and the same number of partitions as you expect to consume in your Samza job? Generally speak

Re: flushing changelog & checkpointing

2016-07-06 Thread Yi Pan
Hi, Buvana, Please see answers below. On Tue, Jul 5, 2016 at 11:47 AM, Ramanan, Buvana (Nokia - US) < buvana.rama...@nokia-bell-labs.com> wrote: > > Does this mean that all writes to the disk for state store purposes will > be done at the checkpointing time (which is also the time Samza checkpoi

Re: The best way to import data into kv store?

2016-07-11 Thread Yi Pan
ort_uid" topic, write to the kv store. > 4) In the task, when processing my realtime stream, read from kv store and > do the join. > > The "key point" is import data with bootstrap stream. > Is this your "batch-to-stream" approach mean? > > > On Thu, Jul 7, 2

Re: [NEED COMMENTS] import-control & checkstyle plugin

2016-07-11 Thread Yi Pan
+1 on removing the import control. The original idea to include the checkstyle.xml is to enforce some coding style guidelines, not to strictly control the imports. W/ the outdated import control list, it practically does not serve the purpose... On Mon, Jul 11, 2016 at 4:02 PM, Navina Ramesh wrot

Re: SamzaSQL document required

2016-07-27 Thread Yi Pan
Hi, Ankita, There is no official release documentation for SamzaSQL yet. If you are referring to the paper in HPBDC this year by Milinda, it is based on several patches under development. I will start by listing the relevant JIRAs: - SAMZA-390: the over-arching ticket describing the view of SQL on

Re: HdfsWriter opens a Bucket for every new file

2016-07-27 Thread Yi Pan
@Thees, Could you open a JIRA to track this issue? And could you also describe the issue in more specific details in the JIRA? e.g. when you mentioned that "HdfsWriter opens a Bucket for every new file", do you mean that HDFSWriter will open a new file everytime a new event is sent via HdfsSystemP

Re: [DISCUSS] [VOTE] Apache Samza 0.10.1 RC0

2016-08-01 Thread Yi Pan
Hi, Navina, Yes. You should put your key in the KEYS file. And +1 (binding) as I built and ran the zopkio integration test successfully as well. On Mon, Aug 1, 2016 at 1:05 PM, Navina Ramesh wrote: > @Garry: Yeah. I wasn't very clear about it either. > > @Yi: Do you know if it is a part of the

Re: Different Serde for Store and Changelog

2016-08-03 Thread Yi Pan
Hi, Nick, Thanks a lot for the input. Does it work for you if you only encrypt the value? If that works, you won't have the problem w/ the order of keys in RocksDB store. Regarding to the decryption cost, if you enable the cache store, most of the cache access is to get the deserialized objects. H

Re: Kafka Streams

2016-08-03 Thread Yi Pan
Hi, Nick, IMHO, there are following points that differs Samza from KStreams: - Stability of local state management. Samza supports durable local state and host-affinity for faster state recovery. 0.10.1 makes further progress in host-affinity to allow a) continuous check-pointing of state store;

Re: Samza yarn job - cannot bind to local host

2016-08-04 Thread Yi Pan
Hi, Shekar, Did you check your firewall configuration? Could you also paste your configuration, especially task.opts? -Yi On Wed, Aug 3, 2016 at 5:56 PM, Shekar Tippur wrote: > I am trying to submit a Samza job to yarn and I get a error: > > Exception in thread "main" java.io.IOException: Cann

Re: Samza yarn job - cannot bind to local host

2016-08-04 Thread Yi Pan
ory > > > > > > # Systems > > > > systems.kafka.samza.factory=org.apache.samza.system.kafka.Ka > > fkaSystemFactory > > > > systems.kafka.samza.msg.serde=json > > > > > > systems.kafka.consumer.zookeeper.connect=host1:2181,

Re: [DISCUSS] [VOTE] Apache Samza 0.10.1 RC0

2016-08-07 Thread Yi Pan
It has been more than 5 days and we have got 3 +1 (binding) and 5 +1 (non-binding) already. Can we conclude this vote? Thanks! On Tue, Aug 2, 2016 at 1:10 PM, Boris Shkolnik wrote: > +1 (non-binding). > > Boris. > > On Mon, Aug 1, 2016 at 11:39 AM, Navina Ramesh > > wrote: > > > Hey all, > > >

Re: [DISCUSS] [VOTE] Apache Samza 0.10.1 RC0

2016-08-07 Thread Yi Pan
Navina > > > On Sun, Aug 7, 2016 at 7:53 PM, Yi Pan wrote: > > > It has been more than 5 days and we have got 3 +1 (binding) and 5 +1 > > (non-binding) already. Can we conclude this vote? Thanks! > > > > On Tue, Aug 2, 2016 at 1:10 PM, Boris Shkolnik wrote: &g

Upcoming Streams Meetup @LinkedIn

2016-08-09 Thread Yi Pan
Hi, all, I am pleased to announce that LinkedIn invites you to attend a Streams Processing meetup on Tuesday, August 23 at our Mountain View campus. There will be speakers from LinkedIn, Confluent, and TripAdvisor. Plea

Re: Question on changelog partition mapping

2016-08-11 Thread Yi Pan
Hi, Tommy, Which version of Samza are you using? Since 0.10, the changelog partition mapping has been moved to the coordinator stream, not in the checkpoint topic any more. That said, I want to ask a few more questions to understand what you referred to as "non-deterministic" behavior. So, betwee

Re: Question on changelog partition mapping

2016-08-16 Thread Yi Pan
ch time it runs, even if the number of tasks is the > same. Does that make sense? The code has changed some since 0.9.1 but > seems to have the same issue even in 0.10.1. > > -Tommy > > On 08/11/2016 06:12 PM, Yi Pan wrote: > > Hi, Tommy, > > Which version of Samza are y

Re: Samza container hang on exception

2016-08-21 Thread Yi Pan
Hi, Sining, This is a known bug that is fixed in 0.10.1 (SAMZA-911). Please try to upgrade to 0.10.1. Thanks! -Yi On Sun, Aug 21, 2016 at 5:55 AM, 李斯宁 wrote: > I have tried restart every kafka server. The container did not recover. > > log have something below: > > 2016-08-21 20:08:21 [WARN

Re: kafka dependency version

2016-08-22 Thread Yi Pan
Hi, Gaurav, There is already an effort going on for this one: SAMZA-855 . It would be good if you can try out the patch. Thanks! -Yi On Mon, Aug 22, 2016 at 1:11 AM, Gaurav Agarwal wrote: > My initial attempt to build against kafka 0.9.0 or 0.1

Re: [DISCUSS] Samza 0.11.0 release

2016-08-24 Thread Yi Pan
Hi, Nicolas, Could you explain to me why Samza is blocking you from upgrading your Kafka brokers to 0.10? At LinkedIn, we are running Samza 0.10 w/ Kafka 0.10 brokers. This is a valid combination since Kafka 0.10 brokers should be backward compatible w/ 0.8.2 clients (which is the version Samza us

Re: [DISCUSS] Samza 0.11.0 release

2016-08-26 Thread Yi Pan
a timestamp field. Kafka 0.10.0 is backwards > compatible with 0.8.x clients but we are concerned about the performance > impact, see > http://kafka.apache.org/documentation.html#upgrade_10_performance_impact. > > Cheers, > > Nicolas > > On 25 Aug 2016 6:20 a.m., "Yi

Re: Question on changelog partition mapping

2016-08-26 Thread Yi Pan
you consider accepting a PR that makes this > change to the standard groupers? It's just strange that the generated > partition mappings can vary like this, even for identical inputs. > > -Tommy > > > On 08/16/2016 03:04 PM, Yi Pan wrote: > > Hi, Tommy, > >

Re: Review Request 51252: SAMZA-1004: Fix some logging and javadoc issues for AsyncStreamTask

2016-09-01 Thread Yi Pan (Data Infrastructure)
t; This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51252/ > --- > > (Updated Aug. 30, 2016, 10:01 p.m.) > > > Review request for samza, Navina Ramesh and Yi Pan (Data Infrastructure). &g

Re: Review Request 51252: SAMZA-1004: Fix some logging and javadoc issues for AsyncStreamTask

2016-09-01 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51252/#review147582 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Aug. 30

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-02 Thread Yi Pan (Data Infrastructure)
> If the goal of this is to actually test the lambdas, then ignore this > > feedback. Sure. I will make the complex lambdas as predefined functions. I assume that you mainly refer to sink()? - Yi --- This is an automatically g

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-02 Thread Yi Pan (Data Infrastructure)
t; line 33 > > <https://reviews.apache.org/r/47835/diff/13/?file=1487061#file1487061line33> > > > > Since this extends Message, I expected to see some @Override > > annotations, unless Message is an empty abstract class. Message actually is an interface class. >

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-02 Thread Yi Pan (Data Infrastructure)
the type parameter, there is no way to find out the key type in build time and use that to define the type parameters for window and join functions. Let's discuss in person tomorrow. - Yi --- This is an automatically generated e-m

Re: Review Request 51126: SAMZA 998: Documentation updates for refactored Job Coordinator

2016-09-07 Thread Yi Pan (Data Infrastructure)
ent215399> Same here. docs/learn/documentation/versioned/jobs/configuration-table.html (line 1597) <https://reviews.apache.org/r/51126/#comment215400> Same. - Yi Pan (Data Infrastructure) On Aug. 16, 2016, 2:18 a.m., Jagadish Venkatraman wrote: > > -

Re: Review Request 50174: SAMZA-977: User doc for samza multithreading

2016-09-07 Thread Yi Pan (Data Infrastructure)
arn/documentation/versioned/container/event-loop.md (line 50) <https://reviews.apache.org/r/50174/#comment215426> from *a* different user thread. - Yi Pan (Data Infrastructure) On Sept. 7, 2016, 5:16 p.m., Xinyu Liu wrote: > > -

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-12 Thread Yi Pan (Data Infrastructure)
essageStram.apply(new MyCustomFilter()) > > ``` > > > > The former doesn't read like English. Could be slightly better if the > > name was "filterWith" but apply() still feels best. > > Yi Pan (Data Infrastructure) wrote: > Let me think ab

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-12 Thread Yi Pan (Data Infrastructure)
/test/java/org/apache/samza/task/sql/UserCallbacksSqlTask.java PRE-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-12 Thread Yi Pan (Data Infrastructure)
return new Function() { > > @Override > > public Boolean apply(S s) { > > return lhs.apply(s) && rhs.apply(s); > > } > > }; > > } > > } > > ``` > > > > Main point though is to use immutability. > > Yi

Re: Review Request 51613: Fix SAMZA-842

2016-09-12 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51613/#review148593 --- Ship it! lgtm! - Yi Pan (Data Infrastructure) On Sept. 2

Re: Review Request 51126: SAMZA 998: Documentation updates for refactored Job Coordinator

2016-09-12 Thread Yi Pan (Data Infrastructure)
uot;, not just use it as a denominator between words. There are tons of violations in our current config. But it would be good to avoid introducing new config w/ the mistake. Not necessarily to be included in this review and should be kept in mind. - Yi Pan (Data Infrastructure)

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-12 Thread Yi Pan (Data Infrastructure)
x27;t it clearer to have one loop like below instead of two embedded loops: while (!isShutdown) { if (!reader.hasNext()) { break; } IncomingMessageEnvelope messageEnvelope = reader.readNext(); try { super.put()

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-12 Thread Yi Pan (Data Infrastructure)
tition (with partition groups). How does it work here? If we assume that "offset" here only refers to fileOffset, please clarify and discard this comment. - Yi Pan (Data Infrastructure) On Sept. 9, 2016, 1:34 a.m., Hai Lu wrote: > >

Re: Review Request 50174: SAMZA-977: User doc for samza multithreading

2016-09-13 Thread Yi Pan (Data Infrastructure)
/documentation/versioned/container/event-loop.md (line 28) <https://reviews.apache.org/r/50174/#comment216325> What about thread-safety among multiple process() operations? I thought that multiple process() can be in multiple threads as well? - Yi Pan (Data Infrastructure) On Sept. 13, 2

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-13 Thread Yi Pan (Data Infrastructure)
16369> nit: unnecessary re-formatting samza-hdfs/src/test/scala/org/apache/samza/system/hdfs/TestHdfsSystemProducerTestSuite.scala (line 338) <https://reviews.apache.org/r/51142/#comment216370> nit: unnecessary re-formatting samza-hdfs/src/test/scala/org/apache/samza/system/hdfs/Tes

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-13 Thread Yi Pan (Data Infrastructure)
> On Sept. 13, 2016, 12:33 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemAdmin.java, > > line 91 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493800#file1493800line91> > > > > You

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-14 Thread Yi Pan (Data Infrastructure)
> On Sept. 13, 2016, 1:37 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/DirectoryPartitioner.java, > > line 83 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493803#file1493803line83> > > &

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-14 Thread Yi Pan (Data Infrastructure)
> On Sept. 13, 2016, 1:37 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/AvroFileHdfsReader.java, > > line 24 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493806#file1493806line24> > >

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-14 Thread Yi Pan (Data Infrastructure)
- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review148787 --- On Sept. 12, 2016, 5:53 p.m., Yi Pan (Data Infrastructure) wrote: >

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-14 Thread Yi Pan (Data Infrastructure)
by making a better version of MessageStream.join() public. For now, let's pond on it a bit more before making the change. - Yi ----------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-14 Thread Yi Pan (Data Infrastructure)
joinSource2) > > .join(joinSource3) > > .window(SessionWindows.into ()) > > .sink(SinkFunction) > > Yi Pan (Data Infrastructure) wrote: > As we discussed offline, there are more details in why the above join() > does not play well w/ t

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-14 Thread Yi Pan (Data Infrastructure)
.java PRE-CREATION samza-sql-core/src/test/java/org/apache/samza/task/sql/UserCallbacksSqlTask.java PRE-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-16 Thread Yi Pan (Data Infrastructure)
; > > This doesn't appear to be testing anything (other than that types > > check). This RB is limited to APIs. I will add more assert/test code to ensure the internal variable/methods are setup/invoked correctly. > On Sept. 14, 2016, 7:03 p.m., Chris Pettitt wrote: &g

Re: Review Request 50174: SAMZA-977: User doc for samza multithreading

2016-09-16 Thread Yi Pan (Data Infrastructure)
ps://reviews.apache.org/r/50174/#comment216843> Isn't this just talking about commit() is mutally exclusive to process/processAsync and window? We can simply state: - Checkpointing is guaranteed to only cover events that are fully processed. It is persisted in commit(

Re: Review Request 51962: SAMZA-1021 Remove the redundent poll waiting inside AsyncRunLoop blockIfBusy

2016-09-16 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51962/#review149278 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Sept

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-18 Thread Yi Pan (Data Infrastructure)
lass need to be exposed? Interfaces are generally > > nicer to program against for extensibility and testing. Though in this case > > it looks like Window is ABC not an interface. > > Yi Pan (Data Infrastructure) wrote: > Window is an ABC, and is intended to be used

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-18 Thread Yi Pan (Data Infrastructure)
To reply, visit: https://reviews.apache.org/r/47835/#review148908 ----------- On Sept. 14, 2016, 8:53 a.m., Yi Pan (Data Infrastructure) wrote: > > --- > This is an au

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-19 Thread Yi Pan (Data Infrastructure)
t: https://reviews.apache.org/r/47835/#review149545 ------- On Sept. 14, 2016, 8:53 a.m., Yi Pan (Data Infrastructure) wrote: > > --- > This is an automatically generated e

Re: Review Request 51126: SAMZA 998: Documentation updates for refactored Job Coordinator

2016-09-19 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51126/#review149579 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Sept

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-21 Thread Yi Pan (Data Infrastructure)
iating a new `SessionWindow` object but, it maybe a good idea to > > expose this as a builder in `Windows`. > > > > Let me know what you think. > > Yi Pan (Data Infrastructure) wrote: > Discussed offline. This can be a simple addition to the existing API

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-21 Thread Yi Pan (Data Infrastructure)
easily when we figure out all to allow users to customize window operator's state later. - Yi --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review149724 ---

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-23 Thread Yi Pan (Data Infrastructure)
samza-sql-core/src/main/java/org/apache/samza/system/sql/Offset.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/task/sql/RouterMessageCollector.java > PRE-CREATION > > samza-sql-core/src/main/java/org/apache/samza/task/sql/SimpleMessageCollector.java > PRE-CREATION > > samza-sql-core/src/test/java/org/apache/samza/sql/data/serializers/SqlAvroSerdeTest.java > PRE-CREATION > > samza-sql-core/src/test/java/org/apache/samza/task/sql/RandomWindowOperatorTask.java > PRE-CREATION > samza-sql-core/src/test/java/org/apache/samza/task/sql/StreamSqlTask.java > PRE-CREATION > > samza-sql-core/src/test/java/org/apache/samza/task/sql/UserCallbacksSqlTask.java > PRE-CREATION > settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 > > Diff: https://reviews.apache.org/r/47835/diff/ > > > Testing > --- > > ./gradlew clean build > > > Thanks, > > Yi Pan (Data Infrastructure) > >

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-23 Thread Yi Pan (Data Infrastructure)
://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-28 Thread Yi Pan (Data Infrastructure)
/sql/UserCallbacksSqlTask.java PRE-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
> On Sept. 13, 2016, 1:37 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/DirectoryPartitioner.java, > > line 58 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493803#file1493803line58> >

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
> On Sept. 14, 2016, 6:19 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/MultiFileHdfsReader.java, > > line 59 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493808#file1493808line59> > > > &

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
> On Sept. 14, 2016, 6:19 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala, > > line 66 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493810#file1493810line66> > > > > It

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51142/ > ------- > > (Updated Sept. 28, 2016, 9:57 p.m.) > > > Review request for samza, Yi Pan (Data In

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
> On Sept. 13, 2016, 12:33 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemConsumer.java, > > line 142 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493801#file1493801line142> > > > >

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
> On Sept. 14, 2016, 6:19 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemFactory.scala, > > line 38 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493812#file1493812line38> > > > >

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
ly log a JIRA for improvement here though, since I heard from Venice team that Hadoop actually is less reliable than Kafka. - Yi Pan (Data Infrastructure) On Sept. 28, 2016, 9:57 p.m., Hai Lu wrote: > > --- > This is an automatically

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
> On Sept. 14, 2016, 6:19 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/MultiFileHdfsReader.java, > > line 59 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493808#file1493808line59> > > > &

Re: Review Request 52403: SAMZA-1028: Moving logline before closing kafka producer and making exception thrown AtomicReference

2016-09-29 Thread Yi Pan (Data Infrastructure)
g/r/52403/#comment219045> So, close() w/ timeout 0 is only available in Kafka 0.9 and above? I think that for 0.11, let's still use Kafka 0.8.2. In 0.12, we should include SAMZA-855 and upgrade to Kafka 0.10. - Yi Pan (Data Infrastructure) On Sept. 29, 2016, 9:23 p.m.

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
stMultiFileHdfsReader.java (line 17) <https://reviews.apache.org/r/51142/#comment219061> nit: I would recommend to test negative case where the offset is out-of-range as well. - Yi Pan (Data Infrastructure) On Sept. 28, 2016, 9:57 p.m., Hai Lu wrote: > > -

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
tps://reviews.apache.org/r/51142/#comment219062> Question: why do we need this in open source? Don't we already have a run-job.sh in open source that is general for any YARN application? - Yi Pan (Data Infrastructure) On Sept. 28, 2016, 9:57 p.m., H

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-10-03 Thread Yi Pan (Data Infrastructure)
> On Sept. 29, 2016, 10:02 p.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala, > > line 197 > > <https://reviews.apache.org/r/51142/diff/5-7/?file=1493810#file1493810line197> > > > >

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-10-03 Thread Yi Pan (Data Infrastructure)
is both for unit test and demo. We can move them to examples as you suggest later. - Yi --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review151227 -

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-10-03 Thread Yi Pan (Data Infrastructure)
ssage? Yes. It is a placeholder. I only added this as an implementation of Message since we are focusing on window operator implementation now. > On Oct. 4, 2016, 12:10 a.m., Jake Maes wrote: > > samza-operator/src/main/java/org/apache/samza/operators/api/MessageStream.java, >

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-10-03 Thread Yi Pan (Data Infrastructure)
radle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-04 Thread Yi Pan (Data Infrastructure)
://reviews.apache.org/r/47994/diff/ Testing --- ./gradlew clean build. Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-10-04 Thread Yi Pan (Data Infrastructure)
object. I will remove them for now. - Yi --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review151345 ----------- On Oct. 4, 2016, 12

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-05 Thread Yi Pan (Data Infrastructure)
39> > > > > I wonder if this is a candidate for being made `final`? From what I can > > tell, this is not modified elsewhere. Make sense. Fixed. - Yi --- This is an automatically generated e-mail. To reply, vi

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-05 Thread Yi Pan (Data Infrastructure)
sting --- ./gradlew clean build. Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-05 Thread Yi Pan (Data Infrastructure)
> > > > > Do we also need the `offset` of the incoming message that flows through > > each of these operators? > > > > Ideally, the offset should be a part of the context.(since, this RB is > > just for the wire-up, I'm certainly open t

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-08 Thread Yi Pan (Data Infrastructure)
ign. It seems that that is not enough to help understanding the layered representation of the DAG (from programming to representation to implementation). I will try to embed something in the code then. Closing this one since the first issue is similar and is kept open. - Yi --------

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-08 Thread Yi Pan (Data Infrastructure)
-operator/src/test/java/org/apache/samza/task/BroadcastOperatorTask.java PRE-CREATION samza-operator/src/test/java/org/apache/samza/task/InputJsonSystemMessage.java PRE-CREATION Diff: https://reviews.apache.org/r/47994/diff/ Testing --- ./gradlew clean build. Thanks, Yi Pan (Data

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-10 Thread Yi Pan (Data Infrastructure)
ology info. > > > > 3. Does the order of the Operators in the list have any meaning? e.g. > > does it implicitly define the order of processing, or is it just for > > consistency, or is the List used to allow duplicates? > > Yi Pan (Data Infrastructur

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-16 Thread Yi Pan (Data Infrastructure)
ology info. > > > > 3. Does the order of the Operators in the list have any meaning? e.g. > > does it implicitly define the order of processing, or is it just for > > consistency, or is the List used to allow duplicates? > > Yi Pan (Data Infrastructur

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-20 Thread Yi Pan (Data Infrastructure)
/InputJsonSystemMessage.java PRE-CREATION Diff: https://reviews.apache.org/r/47994/diff/ Testing --- ./gradlew clean build. Thanks, Yi Pan (Data Infrastructure)

<    1   2   3   4   5   6   7   8   9   10   >