Re: Udf Performance and Object Creation

2015-08-13 Thread Hawin Jiang
Thanks, Timo. That is a good interview question. Best regards, Hawin. On Thu, Aug 13, 2015 at 1:11 AM, Michael Huelfenhaus m.huelfenh...@davengo.com wrote: Hey Timo, yes, that is what I needed to know. Thanks - Michael. On 12.08.2015 at 12:44, Timo Walther twal...@apache.org wrote:

Re: FYI: Blog Post on Flink's Streaming Performance and Fault Tolerance

2015-08-05 Thread Hawin Jiang
Great job, guys. Let me read it carefully. On Wed, Aug 5, 2015 at 7:25 AM, Stephan Ewen se...@apache.org wrote: I forgot the link ;-) http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/ On Wed, Aug 5, 2015 at 4:11 PM, Stephan

Re: Sort Benchmark infrastructure

2015-07-16 Thread Hawin Jiang
, Dept. of Computer Science and Engineering Associate Director, UCSD Center for Networked Systems UC San Diego, La Jolla CA http://www.cs.ucsd.edu/~gmporter/ On Wed, Jul 15, 2015 at 1:44 AM, Hawin Jiang hawin.ji...@gmail.com wrote: Hi George and Mike Thanks for your information. Did you

Re: Building Big Data Benchmarking suite for Apache Flink

2015-07-13 Thread Hawin Jiang
Hi Slim, I will follow this and keep you posted. Thanks. Best regards, Hawin. On Mon, Jul 13, 2015 at 7:04 PM, Slim Baltagi sbalt...@gmail.com wrote: Hi, BigDataBench is an open source Big Data benchmarking suite from both industry and academia. As a subset of BigDataBench,

Sort Benchmark infrastructure

2015-07-13 Thread Hawin Jiang
Hi Michael and George, first of all, congratulations: you have won the sort benchmark again. We are from the Flink community. I am not sure whether it is possible to use your test environment to test Flink for free. We saw that Apache Spark did a good job as well. We want to challenge

Re: TeraSort on Flink and Spark

2015-07-12 Thread Hawin Jiang
Hi Kim and Stephan, Kim's report shows Flink 0.9.0 sorting 3360 GB in 1427 seconds, where 3360 = 80 * 42 (80 GB per node on 42 nodes). Based on Kim's report, the throughput is 2.35 GB/sec for Flink 0.9.0. Kim was using 42 nodes for testing purposes. I found that the best Spark performance result was using
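
The arithmetic in the message above can be checked with a few lines of Python. This is only a sketch using the figures quoted in the snippet (42 nodes, 80 GB per node, 1427 seconds); the variable names are made up for illustration, and the per-node figure assumes 1 GB = 1000 MB.

```python
# Figures quoted in the message (Kim's report, Flink 0.9.0).
nodes = 42
gb_per_node = 80
elapsed_seconds = 1427

# Total data sorted across the cluster.
total_gb = nodes * gb_per_node

# Aggregate sort throughput in GB per second.
throughput_gb_per_sec = total_gb / elapsed_seconds

# Per-node throughput in MB per second (assumes 1 GB = 1000 MB).
per_node_mb_per_sec = throughput_gb_per_sec / nodes * 1000

print(f"total: {total_gb} GB")
print(f"cluster throughput: {throughput_gb_per_sec:.2f} GB/sec")
print(f"per-node throughput: {per_node_mb_per_sec:.1f} MB/sec")
```

Dividing by the node count makes results from clusters of different sizes easier to compare, which matters when setting this number against a Spark run on different hardware.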

Re: Benchmark results between Flink and Spark

2015-07-06 Thread Hawin Jiang
Hi Stephan, yes, you are correct. It looks like TPCx-HS is an industry standard for big data. But how do we get a Flink number on that? I think it is also difficult to get a Spark performance number based on TPCx-HS. If you know someone who can provide servers for performance testing, I would like

Re: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-26 Thread Hawin Jiang
$producer$SyncProducer$$doSend(SyncProducer.scala:72) at kafka.producer.SyncProducer.send(SyncProducer.scala:113) at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:58) On Thu, Jun 25, 2015 at 11:06 PM, Hawin Jiang hawin.ji...@gmail.com wrote: Dear Marton I have upgraded my Flink

Re: Best way to write data to HDFS by Flink

2015-06-26 Thread Hawin Jiang
support for that, it is something you would need to implement yourself in a custom SinkFunction. Best, Marton On Mon, Jun 22, 2015 at 11:51 PM, Hawin Jiang hawin.ji...@gmail.com wrote: Hi Marton, suppose we receive a huge amount of data from Kafka and write it to HDFS immediately. We should use buffer

Re: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-26 Thread Hawin Jiang
. The one where you get this exception: org.apache.commons.lang3.SerializationException: java.io.StreamCorruptedException: invalid stream header: 68617769 Cheers, Aljoscha On Fri, 26 Jun 2015 at 08:21 Hawin Jiang hawin.ji...@gmail.com wrote: Dear Marton Here are some errors when I run
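
The hex value in the exception above can be decoded to see what the deserializer actually received. The sketch below assumes the usual behaviour of Java's `ObjectInputStream`, which reports the first four bytes of the stream when they do not match the serialization magic number and version (`0xACED0005`); the variable names are hypothetical.

```python
# Header bytes reported in the StreamCorruptedException message.
header_hex = "68617769"
header_bytes = bytes.fromhex(header_hex)

# Java serialization streams start with STREAM_MAGIC (0xACED)
# followed by STREAM_VERSION (0x0005).
JAVA_STREAM_HEADER = bytes.fromhex("aced0005")

# Interpret the reported bytes as ASCII text.
header_text = header_bytes.decode("ascii")
is_java_serialized = header_bytes == JAVA_STREAM_HEADER

print(header_text)
print(is_java_serialized)
```

The bytes decode to plain ASCII text, which suggests the message on the topic was written as a raw string while the consumer tried to read it as a Java-serialized object. Which producer wrote that message is not stated in the snippet, so treat this as a likely explanation, not a confirmed diagnosis.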

Re: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-22 Thread Hawin Jiang
-with-dependencies.jar. You should be able to run the example from that. [1] https://github.com/mbalassi/flink-dataflow/blob/master/pom.xml#L286-296 On Thu, Jun 11, 2015 at 10:32 AM, Hawin Jiang hawin.ji...@gmail.com wrote: Dear Marton, what do you mean by locally in Eclipse with 'Run'? Do you want me

Re: Best way to write data to HDFS by Flink

2015-06-22 Thread Hawin Jiang
provided a similar partition API or configuration for this. Thanks. Best regards Hawin On Wed, Jun 10, 2015 at 10:31 AM, Hawin Jiang hawin.ji...@gmail.com wrote: Thanks Marton I will use this code to implement my testing. Best regards Hawin On Wed, Jun 10, 2015 at 1:30 AM, Márton Balassi

Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-11 Thread Hawin Jiang
Hi All, I am preparing a Kafka and Flink performance test now. To avoid mistakes of my own, I have downloaded the Kafka example from http://kafka.apache.org/ and the Flink streaming Kafka example from http://flink.apache.org. I have run both producer examples on the same cluster. No issues from

RE: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-11 Thread Hawin Jiang
? Is it from an IDE or submitting it to a flink cluster with bin/flink run? How do you define your dependencies, do you use maven or sbt for instance? Best, Marton On Thu, Jun 11, 2015 at 9:43 AM, Hawin Jiang hawin.ji...@gmail.com wrote:

Re: Kafka0.8.2.1 + Flink0.9.0 issue

2015-06-11 Thread Hawin Jiang
assembly:assembly'. Neither of these is beautiful, but both would help track down the root cause. On Thu, Jun 11, 2015 at 10:04 AM, Hawin Jiang hawin.ji...@gmail.com wrote: Dear Marton, thanks for your support again. I am running these examples in the same project and I am using the Eclipse IDE

Re: Best way to write data to HDFS by Flink

2015-06-10 Thread Hawin Jiang
://namenode_name:namenode_port/path/to/your/file); Check out the relevant section of the streaming docs for more info. [1] [1] http://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html#connecting-to-the-outside-world Best, Marton On Wed, Jun 10, 2015 at 10:22 AM, Hawin Jiang hawin.ji

Re: Apache Flink transactions

2015-06-09 Thread Hawin Jiang
On Mon, Jun 8, 2015 at 9:09 PM, Hawin Jiang hawin.ji...@gmail.com wrote: Hi Aljoscha, I want to know how Apache Flink performs if I run the same SQL as below. Do you have any Apache Flink benchmark information, such as https://amplab.cs.berkeley.edu/benchmark/ ? Thanks. SELECT

Re: Apache Flink transactions

2015-06-09 Thread Hawin Jiang
( http://mail-archives.apache.org/mod_mbox/flink-user/201506.mbox/%3cd1972778.64426%25jspa...@cray.com%3e). He seems to be running some tests to compare Flink, Spark and MapReduce. Regards, Aljoscha On Mon, Jun 8, 2015 at 9:09 PM, Hawin Jiang hawin.ji...@gmail.com wrote: Hi Aljoscha I want

Best wishes for Kostas Tzoumas and Robert Metzger

2015-06-07 Thread Hawin Jiang
Hi All, as you know, Kostas Tzoumas and Robert Metzger will give two Flink talks at the 2015 Hadoop Summit. That is an excellent opportunity to introduce Apache Flink to the world. Best wishes for Kostas Tzoumas and Robert Metzger. Here are the details: Topic: Apache

Re: Apache Flink transactions

2015-06-05 Thread Hawin Jiang
Thanks, all. Actually, I want to know more about Flink SQL and Flink performance. Here is the Spark benchmark; maybe you have already seen it. https://amplab.cs.berkeley.edu/benchmark/ Thanks. Best regards, Hawin. On Fri, Jun 5, 2015 at 1:35 AM, Fabian Hueske fhue...@gmail.com wrote: If