questions of partition and task of Samza

2015-10-21 Thread Selina Tech
Hi, All: The Samza documentation says, "Each task consumes data from one partition for each of the job’s input streams." Does that mean that if the data a job processes is not all in one partition, the result will be wrong? Assuming my Samza input data on Kafka topic -- "input" is p

Re: How to do aggregation in Samza?

2015-10-21 Thread jeremy p
Wow, Calcite looks really really really cool! It's too bad this feature isn't further along. My current project calls for aggregation, and we don't have the resources to roll our own aggregation engine. Perhaps we'll go with Spark Streaming or Storm for now, and then switch back to Samza once st

Re: How to do aggregation in Samza?

2015-10-21 Thread Selina Tech
Hi, All: I have the same question. Previously, I thought we would just use Java code to implement counts, averages ... in Samza. Does anyone know of any Java libraries? Sincerely, Selina On Wed, Oct 21, 2015 at 8:15 AM, jeremy p wrote: > Hey all, > > So, I'm wanting to do aggregate operations in Samza.

Samza processing reference data

2015-10-21 Thread Chen Song
In our Samza app, we need to read data from MySQL (a reference table) with a stream. The requirements are: * Read data into each Samza task before processing any message. * The Samza task should be able to listen to updates happening in MySQL. I did some research after scanning through some relev

Re: How to do aggregation in Samza?

2015-10-21 Thread Julian Hyde
I am helping with the SQL support. I don’t know timelines but I wanted to chime in on the different aggregate operations. There are several ways to aggregate streams: tumbling, hopping, sliding windows. For example, if you want to periodically emit totals that collapse many rows into one total,

How to do aggregation in Samza?

2015-10-21 Thread jeremy p
Hey all, So, I'm wanting to do aggregate operations in Samza. Counts, averages, grouping, things of that nature. Basically, the kinds of aggregate operations you can do in SQL. What's the best way to do this in Samza? Are there any libraries for this? I noticed a few projects are currently in

Need help in log4j.xml externalization

2015-10-21 Thread Patni, Ankush
Hello Team, I am trying to externalize log4j from my task. At present I run all the tasks from one tar.gz, and inside that I have log4j.xml. But now I want to externalize the log4j.xml so that I can have more control over logs. So before running my task I tried to set the JAVA_OPTS: export