Re: order guarantee failure: better/best effort?

2012-12-14 Thread ben fleis
Hey Jay,  (yes, I just like the sound of that!)

I have a testing harness for my Node client, but I can't (at the moment)
put it out in the open.  If you'd like, I can share it via LI channels and
get you into full simulation without too much pain, I think.

And yes, they are same partition, same socket.  I have the tcpdump logs, my
own console and producer logs and the raw kafka files stored away if that's
useful.  They are stored in the wrong order on disk, so it's definitely on the
incoming side.

ben


Re: order guarantee failure: better/best effort?

2012-12-14 Thread Jay Kreps
That sounds like a bug. Don't worry about the script if it is proprietary, I 
can just write a simple java test harness. That would actually be a good test 
to run going forward.

Sent from my iPhone

On Dec 14, 2012, at 2:21 AM, ben fleis  wrote:

> Hey Jay,  (yes, I just like the sound of that!)
> 
> I have a testing harness for my Node client, but I can't (at the moment)
> put it out in the open.  If you'd like, I can share it via LI channels and
> get you into full simulation without too much pain, I think.
> 
> And yes, they are same partition, same socket.  I have the tcpdump logs, my
> own console and producer logs and the raw kafka files stored away if that's
> useful.  They are stored in the wrong order on disk, so it's definitely on the
> incoming side.
> 
> ben


[jira] [Updated] (KAFKA-616) Implement acks=0

2012-12-14 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps updated KAFKA-616:


Labels: newbie  (was: )

> Implement acks=0
> 
>
> Key: KAFKA-616
> URL: https://issues.apache.org/jira/browse/KAFKA-616
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8
>Reporter: Jay Kreps
>  Labels: newbie
>
> For completeness it would be nice to handle the case where acks=0 in the 
> produce request. The meaning of this would be that the broker immediately 
> responds without blocking even on the local write. The advantage of this is 
> that it would often isolate the producer from any latency in the local write 
> (which we have occasionally seen).
> Since we don't block on the append, the response would contain a placeholder 
> for all the fields--e.g. offset=-1 and no error.
> This should be pretty easy to implement, just an if statement in 
> KafkaApis.handleProduceRequest to send the response immediately in this case 
> (and again to avoid sending a second response later).
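A minimal sketch of the control flow being described, with all names
assumed rather than taken from the actual KafkaApis code:

    object AcksZeroSketch {
      case class ProduceRequest(requiredAcks: Int, payload: Array[Byte])
      case class ProduceResponse(offset: Long, errorCode: Int)

      // acks=0: respond immediately with placeholder fields (offset = -1,
      // no error) and take care not to send a second response after the
      // append completes.
      def handleProduce(req: ProduceRequest,
                        append: Array[Byte] => Long,
                        send: ProduceResponse => Unit): Unit = {
        if (req.requiredAcks == 0) {
          send(ProduceResponse(offset = -1L, errorCode = 0))
          append(req.payload) // the local write still happens; no one waits on it
        } else {
          val offset = append(req.payload)
          send(ProduceResponse(offset, errorCode = 0))
        }
      }
    }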

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (KAFKA-600) kafka should respond gracefully rather than crash when unable to write due to ENOSPC

2012-12-14 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps resolved KAFKA-600.
-

Resolution: Fixed

Since there don't seem to be any objections I am closing out this issue.

> kafka should respond gracefully rather than crash when unable to write due to 
> ENOSPC
> 
>
> Key: KAFKA-600
> URL: https://issues.apache.org/jira/browse/KAFKA-600
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Tim Stilwell
>
> Problem:
> The user starts Kafka with the log.dir value set to a small partition and 
> begins writing data to the MQ. When the disk partition is full, Kafka 
> crashes. Given that this product is used for both reading and writing 
> operations, crashing seems rather drastic even if the error message is 
> helpful. Something more robust would be appreciated. Perhaps log an error 
> and reject additional write requests while accepting additional read 
> requests? Perhaps send an email alert to Operations? At the least, shut 
> down gracefully with a helpful message providing some details of the last 
> message received, so the user knows which messages were saved. When tens 
> or hundreds of thousands of messages can be processed in a second, it 
> isn't helpful to merely log a timestamp and crash.
> steps to reproduce:
> 1) download and install kafka
> 2) modify server.properties
> # vi /opt/kafka-0.7.2-incubating-src/config/server.properties
> set log.dir="/var/log/kafka"
> 3) modify log4j
> # vi /opt/kafka-0.7.2-incubating-src/config/log4j.properties
> set fileAppender.File=/var/log/kafka/kafka-request.log
> 4) start kafka service
> $ sudo bash
> # ulimit -c unlimited
> # /opt/kafka-0.7.2-incubating-src/bin/kafka-server-start.sh 
> /opt/kafka-0.7.2-incubating-src/config/server.properties &
> 5) begin writing data to hostname:9092
> 6) review /var/log/kafka/kafka-request.log
> results:
> $ grep log.dir /opt/kafka-0.7.2-incubating-src/config/server.properties
> log.dir=/var/log/kafka
> $ df -h /var/log/kafka
> Filesystem  Size  Used Avail Use% Mounted on
> /dev/sda1   4.0G  4.0G 0 100% /
> $ tail /var/log/kafka/kafka-request.log
> 17627442 [ZkClient-EventThread-14-10.0.20.242:2181] INFO  
> kafka.server.KafkaZooKeeper  - Begin registering broker topic 
> /brokers/topics/raw/0 with 1 partitions
> 17627444 [ZkClient-EventThread-14-10.0.20.242:2181] INFO  
> kafka.server.KafkaZooKeeper  - End registering broker topic 
> /brokers/topics/raw/0
> 17627445 [ZkClient-EventThread-14-10.0.20.242:2181] INFO  
> kafka.server.KafkaZooKeeper  - done re-registering broker
> 18337676 [kafka-processor-3] ERROR kafka.network.Processor  - Closing socket 
> for /10.0.20.138 because of error
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:218)
> at sun.nio.ch.IOUtil.read(IOUtil.java:191)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359)
> at kafka.utils.Utils$.read(Utils.scala:538)
> at 
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
> at kafka.network.Processor.read(SocketServer.scala:311)
> at kafka.network.Processor.run(SocketServer.scala:214)
> at java.lang.Thread.run(Thread.java:722)
> 18391974 [kafka-processor-4] INFO  kafka.network.Processor  - Closing socket 
> connection to /10.0.20.138.
> 18422004 [kafka-processor-5] INFO  kafka.network.Processor  - Closing socket 
> connection to /10.0.20.138.
> 18434563 [kafka-processor-6] INFO  kafka.network.Processor  - Closing socket 
> connection to /10.0.20.138.
> 18485005 [kafka-processor-7] INFO  kafka.network.Processor  - Closing socket 
> connection to /10.0.20.138.
> 18497083 [kafka-processor-0] INFO  kafka.network.Processor  - Closing socket 
> connection to /10.0.20.138.
> 18525720 [kafka-processor-1] INFO  kafka.network.Processor  - Closing socket 
> connection to /10.0.20.138.
> 18543843 [kafka-processor-2] INFO  kafka.network.Processor  - Closing socket 
> connection to /10.0.20.138.
> 18563230 [kafka-processor-4] INFO  kafka.network.Processor  - Closing socket 
> connection to /10.0.20.138.
> 18575613 [kafka-processor-5] INFO  kafka.network.Processor  - Closing socket 
> connection to /10.0.20.138.
> 18677568 [kafka-processor-6] ERROR kafka.network.Processor  - Closing socket 
> for /10.0.20.138 because of error
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.I
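The gist of what is being asked for can be sketched as follows. Every name
here is hypothetical, not Kafka's actual log code: trap the failed write,
log it, and refuse further writes while continuing to serve reads.

    import java.io.IOException

    // Hypothetical sketch: on a failed write (e.g. ENOSPC), log the error
    // and switch the log to read-only instead of crashing the broker.
    class GuardedLog(append0: Array[Byte] => Long, read0: Long => Array[Byte]) {
      @volatile private var writable = true

      def append(bytes: Array[Byte]): Long = {
        if (!writable)
          throw new IOException("Log is read-only: a previous write failed (disk full?)")
        try append0(bytes)
        catch {
          case e: IOException =>
            writable = false // reject subsequent writes rather than crash
            System.err.println("Write failed, switching log to read-only: " + e)
            throw e
        }
      }

      def read(offset: Long): Array[Byte] = read0(offset) // reads are unaffected
    }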

Re: EasyMock?

2012-12-14 Thread Derek Chen-Becker
Let me go ahead and put in some time next week to get it running again.
I've only been skimming the emails lately, but has Kafka moved over to Git
yet? Using Git would allow me to avoid some of the issues we ran into
trying to submit an SVN patch due to deleted files, etc. Also, which
branch/revision should I be targeting?

Thanks,

Derek


On Wed, Dec 12, 2012 at 11:45 PM, Joe Stein  wrote:

> Hey Derek, I totally got swamped at work and didn't get a chance to bang on
> the sbt change.
>
> I am not an sbt whiz, so you might be better placed to get this over the
> finish line. Is it something you might be able to do?
>
> If so, I can review and commit, np.
>
> IMHO 0.8 should not ship without supporting the latest sbt and Scala in a
> good way. It is one thing I have heard time and again from Scala community
> folks regarding perception, contributions, etc.
>
> Let me know if you don't have time; I can find some time to noodle and
> bang away at it again. Last time I got it all down to one test failing,
> but there are some other tickets I should attend to as well.
>
> On Tue, Nov 20, 2012 at 4:21 PM, Derek Chen-Becker 
> wrote:
>
> > I haven't had any time to work on it in a few months, but it's definitely
> > something I'd like to finish up. I think the last remaining issue is that
> > SBT isn't integrated into the external packaging scripts, but otherwise
> it
> > should work.
> >
> > Derek
> >
> > On Mon, Nov 19, 2012 at 9:24 PM, Joe Stein  wrote:
> >
> > > I have not tried the KAFKA-139 ticket yet for latest scala but will
> give
> > > that a try this week. Looks cool.
> > >
> > > On Mon, Nov 19, 2012 at 7:11 PM, Derek Chen-Becker 
> > > wrote:
> > >
> > > > In particular it breaks some tests between Scala 2.8.0 and 2.9.x due
> to
> > > > changes in Scala's collections impls that impact Map traversal
> order. I
> > > > haven't had any time to work further on the SBT cross-build/upgrade
> > > ticket
> > > > (
> > > > https://issues.apache.org/jira/browse/KAFKA-139), but that was
> > > definitely
> > > > a
> > > > blocker.
> > > >
> > > > Derek
> > > >
> > > > On Mon, Nov 19, 2012 at 3:38 PM, Jay Kreps 
> > wrote:
> > > >
> > > > > What has people's experience with EasyMock been?
> > > > >
> > > > > I am really struggling with it. It seems to encourage reaching
> inside
> > > > > classes and programming an explicit list of calls they will make.
> > This
> > > > > makes the tests very fragile to internal changes.
> > > > >
> > > > > It also doesn't seem to play all that well with scala due to all
> the
> > > > > reflection.
> > > > >
> > > > > I was originally a big fan since it got us away from massive
> > > > > non-deterministic integration tests.
> > > > >
> > > > > But I wonder if we wouldn't be better off just writing simple
> > hand-made
> > > > > mocks for major classes or increasing the quality of our test
> > harnesses
> > > > to
> > > > > do integration testing...?
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > -Jay
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Derek Chen-Becker*
> > > > *Precog Lead Infrastructure Engineer*
> > > > de...@precog.com
> > > > 303-752-1700
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > /*
> > > Joe Stein
> > > http://www.linkedin.com/in/charmalloc
> > > Twitter: @allthingshadoop 
> > > */
> > >
> >
> >
> >
> > --
> > *Derek Chen-Becker*
> > *Precog Lead Infrastructure Engineer*
> > de...@precog.com
> > 303-752-1700
> >
>
>
>
> --
>
> /*
> Joe Stein
> http://www.linkedin.com/in/charmalloc
> Twitter: @allthingshadoop 
> */
>



-- 
*Derek Chen-Becker*
*Precog Lead Infrastructure Engineer*
de...@precog.com
303-752-1700


[jira] [Commented] (KAFKA-374) Move to java CRC32 implementation

2012-12-14 Thread David Arthur (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532413#comment-13532413
 ] 

David Arthur commented on KAFKA-374:


Akka seems a bit overkill for this (although it does have some nice 
properties). It would be interesting to refactor the threading in Kafka with 
Akka and see what kind of performance differences there are (certainly beyond 
the scope of this JIRA).

As for the CRC implementation, is there consensus on what to do here - Java 
or Scala?

I say +1 for Java since no one will need to modify this code and it doesn't 
really matter that it's not Scala.

> Move to java CRC32 implementation
> -
>
> Key: KAFKA-374
> URL: https://issues.apache.org/jira/browse/KAFKA-374
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 0.8
>Reporter: Jay Kreps
>Priority: Minor
>  Labels: newbie
> Attachments: KAFKA-374-draft.patch, KAFKA-374.patch
>
>
> We keep a per-record crc32. This is a fairly cheap algorithm, but the Java 
> implementation uses JNI and it seems to be a bit expensive for small records. 
> I have seen this before in Kafka profiles, and I noticed it on another 
> application I was working on. Basically with small records the native 
> implementation can only checksum < 100MB/sec. Hadoop has done some analysis 
> of this and replaced it with a Java implementation that is 2x faster for 
> large values and 5-10x faster for small values. Details are here HADOOP-6148.
> We should do a quick read/write benchmark on log and message set iteration 
> and see if this improves things.
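A quick way to reproduce the small-record numbers locally is a loop like
the following (a rough sketch; the record size and iteration count are
arbitrary). Swapping a pure-Java CRC32 into the same loop gives a direct
comparison against the JNI-backed java.util.zip.CRC32:

    import java.util.zip.CRC32

    object CrcBench {
      def main(args: Array[String]): Unit = {
        val record = new Array[Byte](100) // small records are the expensive case
        val iterations = 10000000
        val crc = new CRC32()
        val start = System.nanoTime()
        var i = 0
        while (i < iterations) {
          crc.reset()
          crc.update(record, 0, record.length)
          i += 1
        }
        val seconds = (System.nanoTime() - start) / 1e9
        val mbPerSec = iterations.toDouble * record.length / (1024 * 1024) / seconds
        printf("CRC32: %.1f MB/sec on %d-byte records%n", mbPerSec, record.length)
      }
    }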

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Closed] (KAFKA-628) System Test Failure Case 5005 (Mirror Maker bouncing) - Data Loss in ConsoleConsumer

2012-12-14 Thread John Fung (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Fung closed KAFKA-628.
---


> System Test Failure Case 5005 (Mirror Maker bouncing) - Data Loss in 
> ConsoleConsumer
> 
>
> Key: KAFKA-628
> URL: https://issues.apache.org/jira/browse/KAFKA-628
> Project: Kafka
>  Issue Type: Bug
>Reporter: John Fung
> Attachments: kafka-628-reproduce-issue.patch, 
> log4j_and_data_logs.tar.gz
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (KAFKA-628) System Test Failure Case 5005 (Mirror Maker bouncing) - Data Loss in ConsoleConsumer

2012-12-14 Thread John Fung (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Fung resolved KAFKA-628.
-

Resolution: Fixed

This issue is not showing up any more in the 0.8 branch, so I am marking it closed now.

> System Test Failure Case 5005 (Mirror Maker bouncing) - Data Loss in 
> ConsoleConsumer
> 
>
> Key: KAFKA-628
> URL: https://issues.apache.org/jira/browse/KAFKA-628
> Project: Kafka
>  Issue Type: Bug
>Reporter: John Fung
> Attachments: kafka-628-reproduce-issue.patch, 
> log4j_and_data_logs.tar.gz
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


kafka repo moved to git

2012-12-14 Thread Jun Rao
Hi, Everyone,

We have moved the Kafka repo to git. The location of the new repo is:
https://git-wip-us.apache.org/repos/asf/kafka.git

Please follow the instructions at the site below on how to use it.

https://git-wip-us.apache.org

Thanks,

Jun
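For a quick start, cloning looks like this (per the replies in this
thread, https is the writable flavor for committers; http is read-only):

    # read-only checkout
    git clone http://git-wip-us.apache.org/repos/asf/kafka.git

    # writable checkout (committers)
    git clone https://git-wip-us.apache.org/repos/asf/kafka.git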


Re: kafka repo moved to git

2012-12-14 Thread Jay Kreps
And as with apache svn, it sounds like you need to use https if you want a
writable version and http otherwise...

-Jay


On Fri, Dec 14, 2012 at 10:43 AM, Jun Rao  wrote:

> Hi, Everyone,
>
> We have moved the Kafka repo to git. The location of the new repo is:
> https://git-wip-us.apache.org/repos/asf/kafka.git
>
> Please follow the instructions at the site below on how to use it.
>
> https://git-wip-us.apache.org
>
> Thanks,
>
> Jun
>


Re: kafka repo moved to git

2012-12-14 Thread Derek Chen-Becker
Awesome! I hope to have a pull request for updated SBT builds some time
next week :)


On Fri, Dec 14, 2012 at 12:50 PM, Jay Kreps  wrote:

> And as with apache svn, it sounds like you need to use https if you want a
> writable version and http otherwise...
>
> -Jay
>
>
> On Fri, Dec 14, 2012 at 10:43 AM, Jun Rao  wrote:
>
> > Hi, Everyone,
> >
> > We have moved the Kafka repo to git. The location of the new repo is:
> > https://git-wip-us.apache.org/repos/asf/kafka.git
> >
> > Please follow the instructions at the site below on how to use it.
> >
> > https://git-wip-us.apache.org
> >
> > Thanks,
> >
> > Jun
> >
>



-- 
*Derek Chen-Becker*
*Precog Lead Infrastructure Engineer*
de...@precog.com
303-752-1700


Re: new website

2012-12-14 Thread Jay Kreps
Okay, I set up redirects. Since our old website was updated via some
20-minute rsync script, and since ModRewrite is mind-boggling, it is kind of
hard to say whether people will actually be properly redirected to the new
site or whether I just destroyed the old site. We will see in 20 minutes.

What I put in was:
RedirectMatch 301 (.*) http://kafka.apache.org${1}

Seem right?

-Jay


On Fri, Dec 14, 2012 at 1:30 PM, Jay Kreps  wrote:

> http://kafka.apache.org/
>
> I will set up redirects from the old to the new.
>
> If folks could sanity check the links, email addresses, english, and so on
> that would be swell.
>
> -Jay
>


Re: new website

2012-12-14 Thread Johan Lundahl
Hi,

I ran the FF link-checker plugin on the new site and found a few things:

* The api docs in the left menu link to 0.7.1. Is that right, or are there
api docs for 0.7.2?

* On
  http://kafka.apache.org/downloads.html
  the KEYS link http://svn.apache.org/repos/asf/incubator/kafka/KEYS gives a
404

* On
  http://kafka.apache.org/projects.html
  the "let us know" mailto is referring to kafka-...@incubator.apache.org

Another thing I noticed earlier is that some of the presentations in the
wiki at
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
are no longer online, namely:
Kafka: A Distributed pub-sub messaging system, Neha Narkhede from LinkedIn,
ApacheCon 2011
Kafka: LinkedIn's open source distributed pub-sub system, Neha Narkhede
from LinkedIn, SVForum Software Architecture & Platform SIG, July 2011


On Fri, Dec 14, 2012 at 11:04 PM, Jay Kreps  wrote:

> Okay, I set up redirects. Since our old website was updated via some
> 20-minute rsync script, and since ModRewrite is mind-boggling, it is kind
> of hard to say whether people will actually be properly redirected to the
> new site or whether I just destroyed the old site. We will see in 20 minutes.
>
> What I put in was:
> RedirectMatch 301 (.*) http://kafka.apache.org${1}
>
> Seem right?
>
> -Jay
>
>
> On Fri, Dec 14, 2012 at 1:30 PM, Jay Kreps  wrote:
>
> > http://kafka.apache.org/
> >
> > I will set up redirects from the old to the new.
> >
> > If folks could sanity check the links, email addresses, english, and so
> on
> > that would be swell.
> >
> > -Jay
> >
>


[jira] [Updated] (KAFKA-664) Kafka server threads die due to OOME during long running test

2012-12-14 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-664:
-

Attachment: KAFKA-664-v4.patch

All good points - here is v4 with those changes.

> Kafka server threads die due to OOME during long running test
> -
>
> Key: KAFKA-664
> URL: https://issues.apache.org/jira/browse/KAFKA-664
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8
>Reporter: Neha Narkhede
>Assignee: Jay Kreps
>Priority: Blocker
>  Labels: bugs
> Fix For: 0.8
>
> Attachments: kafka-664-draft-2.patch, kafka-664-draft.patch, 
> KAFKA-664-v3.patch, KAFKA-664-v4.patch, Screen Shot 2012-12-09 at 11.22.50 
> AM.png, Screen Shot 2012-12-09 at 11.23.09 AM.png, Screen Shot 2012-12-09 at 
> 11.31.29 AM.png, thread-dump.log, watchersForKey.png
>
>
> I set up a Kafka cluster with 5 brokers (JVM memory 512M) and set up a long 
> running producer process that sends data to 100s of partitions continuously 
> for ~15 hours. After ~4 hours of operation, a few server threads (acceptor 
> and processor) exited due to OOME -
> [2012-12-07 08:24:44,355] ERROR OOME with size 1700161893 
> (kafka.network.BoundedByteBufferReceive)
> java.lang.OutOfMemoryError: Java heap space
> [2012-12-07 08:24:44,356] ERROR Uncaught exception in thread 
> 'kafka-acceptor': (kafka.utils.Utils$)
> java.lang.OutOfMemoryError: Java heap space
> [2012-12-07 08:24:44,356] ERROR Uncaught exception in thread 
> 'kafka-processor-9092-1': (kafka.utils.Utils$)
> java.lang.OutOfMemoryError: Java heap space
> [2012-12-07 08:24:46,344] INFO Unable to reconnect to ZooKeeper service, 
> session 0x13afd0753870103 has expired, closing socket connection 
> (org.apache.zookeeper.ClientCnxn)
> [2012-12-07 08:24:46,344] INFO zookeeper state changed (Expired) 
> (org.I0Itec.zkclient.ZkClient)
> [2012-12-07 08:24:46,344] INFO Initiating client connection, 
> connectString=eat1-app309.corp:12913,eat1-app310.corp:12913,eat1-app311.corp:12913,eat1-app312.corp:12913,eat1-app313.corp:12913
>  sessionTimeout=15000 watcher=org.I0Itec.zkclient.ZkClient@19202d69 
> (org.apache.zookeeper.ZooKeeper)
> [2012-12-07 08:24:55,702] ERROR OOME with size 2001040997 
> (kafka.network.BoundedByteBufferReceive)
> java.lang.OutOfMemoryError: Java heap space
> [2012-12-07 08:25:01,192] ERROR Uncaught exception in thread 
> 'kafka-request-handler-0': (kafka.utils.Utils$)
> java.lang.OutOfMemoryError: Java heap space
> [2012-12-07 08:25:08,739] INFO Opening socket connection to server 
> eat1-app311.corp/172.20.72.75:12913 (org.apache.zookeeper.ClientCnxn)
> [2012-12-07 08:25:14,221] INFO Socket connection established to 
> eat1-app311.corp/172.20.72.75:12913, initiating session 
> (org.apache.zookeeper.ClientCnxn)
> [2012-12-07 08:25:17,943] INFO Client session timed out, have not heard from 
> server in 3722ms for sessionid 0x0, closing socket connection and attempting 
> reconnect (org.apache.zookeeper.ClientCnxn)
> [2012-12-07 08:25:19,805] ERROR error in loggedRunnable (kafka.utils.Utils$)
> java.lang.OutOfMemoryError: Java heap space
> [2012-12-07 08:25:23,528] ERROR OOME with size 1853095936 
> (kafka.network.BoundedByteBufferReceive)
> java.lang.OutOfMemoryError: Java heap space
> It seems like it runs out of memory while trying to read the producer 
> request, but it's unclear so far.
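The suspicious sizes in the log (e.g. 1700161893 bytes) suggest a garbage
size prefix being handed straight to an allocation. A sketch of the kind of
guard that would fail fast instead, with hypothetical names rather than the
actual BoundedByteBufferReceive code:

    import java.io.{EOFException, IOException}
    import java.nio.ByteBuffer
    import java.nio.channels.ReadableByteChannel

    object BoundedReceive {
      // Read the 4-byte size prefix, validate it against an upper bound, and
      // only then allocate the body buffer, so a corrupt size yields a clean
      // IOException instead of an OutOfMemoryError.
      def readRequest(channel: ReadableByteChannel, maxSize: Int): ByteBuffer = {
        val sizeBuf = ByteBuffer.allocate(4)
        while (sizeBuf.hasRemaining)
          if (channel.read(sizeBuf) < 0) throw new EOFException()
        sizeBuf.rewind()
        val size = sizeBuf.getInt()
        if (size <= 0 || size > maxSize)
          throw new IOException("Request size " + size + " out of bounds (max " + maxSize + ")")
        val body = ByteBuffer.allocate(size)
        while (body.hasRemaining)
          if (channel.read(body) < 0) throw new EOFException()
        body.rewind()
        body
      }
    }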

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: EasyMock?

2012-12-14 Thread Derek Chen-Becker
The bug bit me, so I reapplied my patch on the latest from the 0.8 branch. I
pushed it up to my Git repo:

https://github.com/dchenbecker/kafka-sbt

I got it back to the point where it compiles, tests, etc. The stumbling
block last time was packaging, so there's still a fair bit of work to do
there. I also need to re-test cross builds with 2.9.1/2.9.2, since both of
those had issues with EasyMock (fixed in this patched version). If you have a
github account I can add you to the repo if you want to use it as the
workspace. I'm not an Apache committer so I can't put a branch there.

Derek


On Wed, Dec 12, 2012 at 11:45 PM, Joe Stein  wrote:

> Hey Derek, I totally got swamped at work and didn't get a chance to bang on
> the sbt change.
>
> I am not an sbt whiz, so you might be better placed to get this over the
> finish line. Is it something you might be able to do?
>
> If so, I can review and commit, np.
>
> IMHO 0.8 should not ship without supporting the latest sbt and Scala in a
> good way. It is one thing I have heard time and again from Scala community
> folks regarding perception, contributions, etc.
>
> Let me know if you don't have time; I can find some time to noodle and
> bang away at it again. Last time I got it all down to one test failing,
> but there are some other tickets I should attend to as well.
>
> On Tue, Nov 20, 2012 at 4:21 PM, Derek Chen-Becker 
> wrote:
>
> > I haven't had any time to work on it in a few months, but it's definitely
> > something I'd like to finish up. I think the last remaining issue is that
> > SBT isn't integrated into the external packaging scripts, but otherwise
> it
> > should work.
> >
> > Derek
> >
> > On Mon, Nov 19, 2012 at 9:24 PM, Joe Stein  wrote:
> >
> > > I have not tried the KAFKA-139 ticket yet for latest scala but will
> give
> > > that a try this week. Looks cool.
> > >
> > > On Mon, Nov 19, 2012 at 7:11 PM, Derek Chen-Becker 
> > > wrote:
> > >
> > > > In particular it breaks some tests between Scala 2.8.0 and 2.9.x due
> to
> > > > changes in Scala's collections impls that impact Map traversal
> order. I
> > > > haven't had any time to work further on the SBT cross-build/upgrade
> > > ticket
> > > > (
> > > > https://issues.apache.org/jira/browse/KAFKA-139), but that was
> > > definitely
> > > > a
> > > > blocker.
> > > >
> > > > Derek
> > > >
> > > > On Mon, Nov 19, 2012 at 3:38 PM, Jay Kreps 
> > wrote:
> > > >
> > > > > What has people's experience with EasyMock been?
> > > > >
> > > > > I am really struggling with it. It seems to encourage reaching
> inside
> > > > > classes and programming an explicit list of calls they will make.
> > This
> > > > > makes the tests very fragile to internal changes.
> > > > >
> > > > > It also doesn't seem to play all that well with scala due to all
> the
> > > > > reflection.
> > > > >
> > > > > I was originally a big fan since it got us away from massive
> > > > > non-deterministic integration tests.
> > > > >
> > > > > But I wonder if we wouldn't be better off just writing simple
> > hand-made
> > > > > mocks for major classes or increasing the quality of our test
> > harnesses
> > > > to
> > > > > do integration testing...?
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > -Jay
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Derek Chen-Becker*
> > > > *Precog Lead Infrastructure Engineer*
> > > > de...@precog.com
> > > > 303-752-1700
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > /*
> > > Joe Stein
> > > http://www.linkedin.com/in/charmalloc
> > > Twitter: @allthingshadoop 
> > > */
> > >
> >
> >
> >
> > --
> > *Derek Chen-Becker*
> > *Precog Lead Infrastructure Engineer*
> > de...@precog.com
> > 303-752-1700
> >
>
>
>
> --
>
> /*
> Joe Stein
> http://www.linkedin.com/in/charmalloc
> Twitter: @allthingshadoop 
> */
>



-- 
*Derek Chen-Becker*
*Precog Lead Infrastructure Engineer*
de...@precog.com
303-752-1700


Re: new website

2012-12-14 Thread David Arthur

On 12/14/12 4:30 PM, Jay Kreps wrote:
> http://kafka.apache.org/
>
> I will set up redirects from the old to the new.
>
> If folks could sanity check the links, email addresses, english, and so on
> that would be swell.
>
> -Jay

Small nit - Flume has graduated and has a new webpage 
http://flume.apache.org/




[jira] [Updated] (KAFKA-598) decouple fetch size from max message size

2012-12-14 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-598:
-

Attachment: KAFKA-598-v3.patch

Quick overview of the revised patch:

1 - Addressed your comment about the previous behavior in ConsumerIterator
  (good catch on that!) and the config defaults.
2 - Changed the semantics of fetch size to max memory. Max memory is a long (as
  an int would currently limit it to 2G). The actual partition fetch size is
  checked for overflow (in which case it is set to Int.MaxValue); see the
  sketch after this list.
3 - Also introduced a DeprecatedProperties convenience class that will be
  checked upon config verification. I added this because I think max.memory
  is a more meaningful config than fetch.size and we can use this to
  deprecate other configs if needed.
4 - The partition count is a volatile int - I chose that over a method only to
  avoid traversal (for each request) to determine the count.


> decouple fetch size from max message size
> -
>
> Key: KAFKA-598
> URL: https://issues.apache.org/jira/browse/KAFKA-598
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.8
>Reporter: Jun Rao
>Assignee: Joel Koshy
>Priority: Blocker
> Attachments: KAFKA-598-v1.patch, KAFKA-598-v2.patch, 
> KAFKA-598-v3.patch
>
>
> Currently, a consumer has to set fetch size larger than the max message size. 
> This increases the memory footprint on the consumer, especially when a large 
> number of topics/partitions are subscribed. By decoupling the fetch size from 
> max message size, we can use a smaller fetch size for normal consumption and 
> when hitting a large message (hopefully rare), we automatically increase 
> fetch size to max message size temporarily.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-139) cross-compile multiple Scala versions

2012-12-14 Thread derek (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532884#comment-13532884
 ] 

derek commented on KAFKA-139:
-

Since we're on Git now, I've pushed a branch with the changes here:

https://github.com/dchenbecker/kafka-sbt

I haven't had a chance to look at Sam's patches yet, but I'll get those applied 
as well. The branch is taken from the latest 0.8 head, using SBT 0.12.1.

> cross-compile multiple Scala versions
> -
>
> Key: KAFKA-139
> URL: https://issues.apache.org/jira/browse/KAFKA-139
> Project: Kafka
>  Issue Type: Improvement
>  Components: packaging
>Affects Versions: 0.8
>Reporter: Chris Burroughs
>  Labels: build
> Fix For: 0.8
>
> Attachments: kafka-sbt0-11-3-0.8.patch, kafka-sbt0-11-3-0.8-v2.patch, 
> kafka-sbt0-11-3-0.8-v3.patch, kafka-sbt0-11-3-0.8-v4.patch, 
> kafka-sbt0-11-3-0.8-v5-smeder.patch, kafka-sbt0-11-3-0.8-v6-smeder.patch, 
> kafka-sbt0-11-3.patch
>
>
> Since Scala does not maintain binary compatibility between versions, 
> organizations tend to have to move all of their code at the same time.  It 
> would thus be very helpful if we could cross-build multiple Scala versions.
> http://code.google.com/p/simple-build-tool/wiki/CrossBuild
> Unclear if this would require KAFKA-134 or just work.
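For reference, cross-building in sbt boils down to listing the target
versions and prefixing commands with '+' (a sketch; the exact version list
here is illustrative):

    // build.sbt (sbt 0.12.x style); the version list is illustrative
    scalaVersion := "2.8.0"

    crossScalaVersions := Seq("2.8.0", "2.9.1", "2.9.2")

    // then `sbt +test` or `sbt +package` runs against each listed version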

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: CRC patch

2012-12-14 Thread David Arthur

+1 from me, though I did submit the patch
On 12/13/12 1:40 AM, Joe Stein wrote:
> +1; the code change looks alright. I prefer using the Java version in
> this case only because we are using an implementation from another project
> and it's a drop-in ... no reason to change things without good reason, yup.
>
> On Wed, Dec 12, 2012 at 11:55 PM, Jay Kreps  wrote:
>
>> This patch is pretty safe; I did a pretty serious test against the Java
>> impl on millions of CRCs. The code change is just a few lines. I would
>> like to get this on trunk. Review?
>>
>> https://issues.apache.org/jira/browse/KAFKA-374
>>
>> Also, do folks have a preference between the java and scala version?
>>
>> -jay


Splitting client code

2012-12-14 Thread David Arthur
Seems to be a commonly discussed topic. What do people think about this? 
Are there any plans for it in 0.8?


When I've done stuff like this in the past for projects, it's been a 
combination of reorganizing code and configuring a build tool like Ivy 
using sub-modules. Does sbt support modules in a similar fashion?
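For what it's worth, sbt does support sub-projects; a minimal sketch of an
sbt 0.12.x-era Build.scala (the module names here are hypothetical, just to
show the shape):

    import sbt._

    // Hypothetical split: a thin client module, with core depending on it
    // and a root project aggregating both.
    object KafkaBuild extends Build {
      lazy val client = Project("kafka-client", file("client"))
      lazy val core   = Project("kafka-core", file("core")).dependsOn(client)
      lazy val root   = Project("kafka", file(".")).aggregate(client, core)
    }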


The counter argument, of course, is that Kafka is a small project with 
minimal dependencies, so a thin client artifact is not going to save you 
much. That said, a Java consumer/producer jar with no dependencies would 
be nice :)


-David


Re: Splitting client code

2012-12-14 Thread Jay Kreps
Definitely no plans in 0.8, we are trying to get it out the door so we
really want to change as little as possible.

We haven't had huge discussions around 0.9 yet, but some of us have had an
interest in simplifying the client implementation. To me this would mean
(1) move the consumer co-ordination to the broker so it is accessible in
all languages, (2) move to non-blocking I/O for the clients, and (3) refine
the producer and consumer apis (not the protocol, the interfaces). Taken
together these represent a rewrite of the scala clients.

It would be possible to go all-in and just do them in Java at the same
time. I agree that the client code would ideally be in java and have no
dependencies, though I don't know if that is worth the cost.

There are a couple minor benefits to Java clients: clearer stack traces,
smaller jar file, fewer dependencies, and (ironically)
better compatibility with other scala versions. I don't see these as big
wins. The big win of this would actually be a clean room java
implementation with zero dependencies back or forth to the server. Ideally
in a separate repository. This and being in another language would force us
to think of the client as fundamentally different from the server. This has
mentally been very difficult for the team, for whatever reason, in a shared
code base. Having a completely stand alone client would force this issue.

I have to say I don't see huge value in trying to split the existing scala
code into multiple project directories and jars. Trying to split the
existing code would likely just make a mess, as we have tried to share as
much code as possible to avoid duplication--so things like message set
definition, network code, and utilities are all shared.

The cost of a clean-room implementation would be large amounts of code
duplication, and a much more jarring experience for us developers. In
general programming in Java kind of sucks, and maintaining similar code
across two languages would be a real headache.

-Jay


On Fri, Dec 14, 2012 at 7:57 PM, David Arthur  wrote:

> Seems to be a commonly discussed topic. What do people think about this?
> Are there any plans for it in 0.8?
>
> When I've done stuff like this in the past for projects, it's been a
> combination of reorganizing code and configuring a build tool like Ivy
> using sub-modules. Does sbt support modules in a similar fashion?
>
> The counter argument, of course, is that Kafka is a small project with
> minimal dependencies, so a thin client artifact is not going to save you
> much. That said, a Java consumer/producer jar with no dependencies would be
> nice :)
>
> -David
>