[jira] [Created] (KAFKA-6620) Documentation about "exactly_once" doesn't mention "transaction.state.log.min.isr"

2018-03-07 Thread Daniel Qian (JIRA)
Daniel Qian created KAFKA-6620:
--

 Summary: Documentation about "exactly_once" doesn't mention 
"transaction.state.log.min.isr" 
 Key: KAFKA-6620
 URL: https://issues.apache.org/jira/browse/KAFKA-6620
 Project: Kafka
  Issue Type: Bug
  Components: documentation, streams
Reporter: Daniel Qian


Documentation about "processing.guarantee" says:
{quote}The processing guarantee that should be used. Possible values are 
{{at_least_once}}(default) and {{exactly_once}}. Note that exactly-once 
processing requires a cluster of at least three brokers by default what is the 
recommended setting for production; *for development you can change this, by 
adjusting broker setting* 
`*transaction.state.log.replication.factor*`
{quote}
If one only sets *transaction.state.log.replication.factor=1* but leaves 
*transaction.state.log.min.isr* at its default value (which is 2), the Streams 
application will break.

Hope you guys modify the doc, thanks.
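
For a single-broker development setup, both settings have to be lowered together; a 
minimal sketch of the relevant server.properties entries (based on the description 
above, not an official recommendation):
{code}
# single-broker development cluster (sketch)
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
{code}
With both values lowered, a Streams application configured with 
processing.guarantee=exactly_once can start against one broker.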



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-5999) Offset Fetch Request

2018-03-07 Thread Zhao Weilong (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389387#comment-16389387
 ] 

Zhao Weilong commented on KAFKA-5999:
-

Sorry for the late reply.

On Wed, Mar 7, 2018 at 2:12 PM, Ewen Cheslack-Postava (JIRA)


> Offset Fetch Request
> 
>
> Key: KAFKA-5999
> URL: https://issues.apache.org/jira/browse/KAFKA-5999
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Zhao Weilong
>Assignee: Ewen Cheslack-Postava
>Priority: Major
>
> Newer Kafka (observed in 0.10.2.1) supports fetching offsets for all topics by 
> setting the number of topics to -1 in the request. (v2)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-5999) Offset Fetch Request

2018-03-07 Thread Zhao Weilong (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389402#comment-16389402
 ] 

Zhao Weilong commented on KAFKA-5999:
-

Hi Vahid Hashemian, Ewen Cheslack-Postava:

Sorry for the late reply.

Currently, OffsetFetch can fetch offsets for all topic partitions.
This feature can be seen in:
org.apache.kafka.common.requests.OffsetFetchRequest

Specifically, the number of topics is set to -1 in the serialized byte array.

I would like this feature to be included in this documentation:
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol

Please consider. Thanks!
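
For illustration, the encoding referred to above looks roughly like this (a sketch of 
the wire-format idea only, not the actual OffsetFetchRequest serialization code):
{code:java}
import java.nio.ByteBuffer;

public class OffsetFetchAllTopicsSketch {
    public static void main(String[] args) {
        ByteBuffer body = ByteBuffer.allocate(16);
        body.putShort((short) 0); // group_id: empty string, just for the sketch
        body.putInt(-1);          // topics array length: -1 (null array) => all topic partitions
        body.flip();
        System.out.println("request body bytes: " + body.remaining());
    }
}
{code}
In OffsetFetch v2 and later, a null topics array (serialized as an array length of -1) 
asks the broker for the committed offsets of every topic partition of the group.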



> Offset Fetch Request
> 
>
> Key: KAFKA-5999
> URL: https://issues.apache.org/jira/browse/KAFKA-5999
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Zhao Weilong
>Assignee: Ewen Cheslack-Postava
>Priority: Major
>
> Newer Kafka (observed in 0.10.2.1) supports fetching offsets for all topics by 
> setting the number of topics to -1 in the request. (v2)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6195) DNS alias support for secured connections

2018-03-07 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389585#comment-16389585
 ] 

Rajini Sivaram commented on KAFKA-6195:
---

[~jonathanskrzypek] Sorry for the delay, I was out till today. I had a look at 
the PR. I think it will break SSL hostname verification when IP addresses are 
used instead of hostnames. At the moment, we can create certificates with 
broker IP address in the SubjectAlternativeName. Clients using IP addresses in 
the bootstrap server list connect successfully since the address used for 
connection is the same as the address in the certificate. With the changes in 
the PR, clients will do reverse DNS lookup and match the hostname returned by 
the lookup against the IP address in the certificate and fail SSL handshake.

This is a useful feature and the code makes sense, but I think it needs to be 
optional. Is there some way we can specify that a reverse DNS lookup is 
required (e.g. in the bootstrap server list itself as part of the URL)?
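
To make the failure mode concrete, the behaviour described above would look roughly 
like this on the client side (illustrative only; the broker IP is hypothetical):
{code:java}
import java.net.InetAddress;

public class ReverseLookupSketch {
    public static void main(String[] args) throws Exception {
        // Today: the client connects to the bootstrap IP and hostname verification
        // matches it against an IP SubjectAlternativeName in the broker certificate.
        InetAddress broker = InetAddress.getByName("10.0.0.12"); // hypothetical broker IP

        // With an unconditional reverse DNS lookup, the client would instead verify
        // the PTR name (e.g. "broker1.example.com") against the IP-only SAN,
        // and the SSL handshake would fail.
        System.out.println("name used for verification: " + broker.getCanonicalHostName());
    }
}
{code}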

> DNS alias support for secured connections
> -
>
> Key: KAFKA-6195
> URL: https://issues.apache.org/jira/browse/KAFKA-6195
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Jonathan Skrzypek
>Priority: Major
>
> It seems clients can't use a dns alias in front of a secured Kafka cluster.
> So applications can only specify a list of hosts or IPs in bootstrap.servers 
> instead of an alias encompassing all cluster nodes.
> Using an alias in bootstrap.servers results in the following error : 
> javax.security.sasl.SaslException: An error: 
> (java.security.PrivilegedActionException: javax.security.sasl.SaslException: 
> GSS initiate failed [Caused by GSSException: No valid credentials provided 
> (Mechanism level: Fail to create credential. (63) - No service creds)]) 
> occurred when evaluating SASL token received from the Kafka Broker. Kafka 
> Client will go to AUTH_FAILED state. [Caused by 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Fail to create 
> credential. (63) - No service creds)]]
> When using SASL/Kerberos authentication, the kafka server principal is of the 
> form kafka@kafka/broker1.hostname@example.com
> Kerberos requires that the hosts can be resolved by their FQDNs.
> During SASL handshake, the client will create a SASL token and then send it 
> to kafka for auth.
> But to create a SASL token the client first needs to be able to validate that 
> the broker's kerberos principal is a valid one.
> There are 3 potential options :
> 1. Creating a single kerberos principal not linked to a host but to an alias 
> and reference it in the broker jaas file.
> But I think the kerberos infrastructure would refuse to validate it, so the 
> SASL handshake would still fail
> 2. Modify the client bootstrap mechanism to detect whether bootstrap.servers 
> contains a dns alias. If it does, resolve and expand the alias to retrieve 
> all hostnames behind it and add them to the list of nodes.
> This could be done by modifying parseAndValidateAddresses() in ClientUtils
> 3. Add a cluster.alias parameter that would be handled by the logic above. 
> Having another parameter to avoid confusion on how bootstrap.servers works 
> behind the scenes.
> Thoughts ?
> I would be happy to contribute the change for any of the options.
> I believe the ability to use a dns alias instead of static lists of brokers 
> would bring good deployment flexibility.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6195) DNS alias support for secured connections

2018-03-07 Thread Jonathan Skrzypek (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389736#comment-16389736
 ] 

Jonathan Skrzypek commented on KAFKA-6195:
--

Thanks for reviewing.

Yes, that makes sense!
So we could have an option like bootstrap.reverse.dns.lookup=true/false.

Or we could implement both behaviours by default, where the client checks 
the security.protocol config:

If SSL, don't resolve and leave the list as is.
If anything else, or if nothing is specified, do the reverse DNS lookup by calling 
getCanonicalHostName.

But that means the behaviour is a bit hidden behind the scenes instead of expressed 
explicitly.

As a side note, doesn't the above mean that users can't have a single SSL certificate 
with a domain name for a given cluster?
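
A rough sketch of what such an option could look like (the flag name 
bootstrap.reverse.dns.lookup and the helper below are hypothetical; the real change 
would live around ClientUtils#parseAndValidateAddresses):
{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BootstrapAliasSketch {
    // Hypothetical helper: expand a DNS alias into the canonical broker names
    // behind it, but only when the (hypothetical) bootstrap.reverse.dns.lookup
    // flag is enabled, so SSL setups relying on IP or alias certificates keep
    // today's behaviour.
    static List<String> expand(String alias, int port, boolean reverseDnsLookup)
            throws UnknownHostException {
        if (!reverseDnsLookup)
            return Collections.singletonList(alias + ":" + port); // leave as is
        List<String> endpoints = new ArrayList<>();
        for (InetAddress addr : InetAddress.getAllByName(alias))
            endpoints.add(addr.getCanonicalHostName() + ":" + port); // FQDNs for Kerberos
        return endpoints;
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println(expand("kafka.example.com", 9092, true)); // hypothetical alias
    }
}
{code}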

 

> DNS alias support for secured connections
> -
>
> Key: KAFKA-6195
> URL: https://issues.apache.org/jira/browse/KAFKA-6195
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Jonathan Skrzypek
>Priority: Major
>
> It seems clients can't use a dns alias in front of a secured Kafka cluster.
> So applications can only specify a list of hosts or IPs in bootstrap.servers 
> instead of an alias encompassing all cluster nodes.
> Using an alias in bootstrap.servers results in the following error : 
> javax.security.sasl.SaslException: An error: 
> (java.security.PrivilegedActionException: javax.security.sasl.SaslException: 
> GSS initiate failed [Caused by GSSException: No valid credentials provided 
> (Mechanism level: Fail to create credential. (63) - No service creds)]) 
> occurred when evaluating SASL token received from the Kafka Broker. Kafka 
> Client will go to AUTH_FAILED state. [Caused by 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Fail to create 
> credential. (63) - No service creds)]]
> When using SASL/Kerberos authentication, the kafka server principal is of the 
> form kafka@kafka/broker1.hostname@example.com
> Kerberos requires that the hosts can be resolved by their FQDNs.
> During SASL handshake, the client will create a SASL token and then send it 
> to kafka for auth.
> But to create a SASL token the client first needs to be able to validate that 
> the broker's kerberos principal is a valid one.
> There are 3 potential options :
> 1. Creating a single kerberos principal not linked to a host but to an alias 
> and reference it in the broker jaas file.
> But I think the kerberos infrastructure would refuse to validate it, so the 
> SASL handshake would still fail
> 2. Modify the client bootstrap mechanism to detect whether bootstrap.servers 
> contains a dns alias. If it does, resolve and expand the alias to retrieve 
> all hostnames behind it and add them to the list of nodes.
> This could be done by modifying parseAndValidateAddresses() in ClientUtils
> 3. Add a cluster.alias parameter that would be handled by the logic above. 
> Having another parameter to avoid confusion on how bootstrap.servers works 
> behind the scenes.
> Thoughts ?
> I would be happy to contribute the change for any of the options.
> I believe the ability to use a dns alias instead of static lists of brokers 
> would bring good deployment flexibility.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-6614) kafka-streams to configure internal topics message.timestamp.type=CreateTime

2018-03-07 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-6614:
-
Labels: newbie  (was: )

> kafka-streams to configure internal topics message.timestamp.type=CreateTime
> 
>
> Key: KAFKA-6614
> URL: https://issues.apache.org/jira/browse/KAFKA-6614
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Dmitry Vsekhvalnov
>Priority: Minor
>  Labels: newbie
>
> After fixing KAFKA-4785, all internal topics use the built-in 
> *RecordMetadataTimestampExtractor* to read timestamps.
> This doesn't seem to work correctly out of the box with Kafka brokers configured 
> with *log.message.timestamp.type=LogAppendTime* when a custom message 
> timestamp extractor is used.
> Example use-case windowed grouping + aggregation on late data:
> {code:java}
> KTable, Long> summaries = in
>    .groupBy((key, value) -> ..)
>    .windowedBy(TimeWindows.of(TimeUnit.HOURS.toMillis(1l)))
>    .count();{code}
> When processing late events:
>  # the custom timestamp extractor will pick up a timestamp in the past from the message 
> (let's say an hour ago)
>  # the re-partition topic created during the grouping phase will be written back to Kafka 
> using the timestamp from (1)
>  # the Kafka broker will ignore the provided timestamp in (2) in favor of ingestion time
>  # the streams lib will read the re-partitioned topic back with the 
> RecordMetadataTimestampExtractor
>  # and will get the ingestion timestamp from (3), which is usually close to "now"
>  # the window start/end will be incorrectly set based on "now" instead of the 
> original timestamp from the payload
> I understand there are ways to configure the per-topic timestamp type on the Kafka 
> brokers to solve this, but it would be really nice if the kafka-streams library 
> could take care of it itself, following the "least-surprise" principle: if the 
> library relies on timestamp.type for a topic it manages, it should enforce it.
> CC [~guozhang] based on user group email discussion.
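
Until Streams enforces this itself, one possible client-side workaround is to pass the 
topic-level timestamp type through to the internal topics it creates (a sketch; whether 
this covers all internal topics depends on the Streams version):
{code:java}
import java.util.Properties;
import org.apache.kafka.common.config.TopicConfig;
import org.apache.kafka.streams.StreamsConfig;

public class InternalTopicTimestampTypeSketch {
    static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "windowed-counts");   // hypothetical app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Ask Streams to create its internal topics with CreateTime, overriding a
        // broker-level log.message.timestamp.type=LogAppendTime default.
        props.put(StreamsConfig.topicPrefix(TopicConfig.MESSAGE_TIMESTAMP_TYPE_CONFIG), "CreateTime");
        return props;
    }
}
{code}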



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6127) Streams should never block infinitely

2018-03-07 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390105#comment-16390105
 ] 

Guozhang Wang commented on KAFKA-6127:
--

We do poll() with a timeout, but yes, admittedly today that timeout is not 
strictly respected, since coordinator re-discovery / re-joining the group etc. is 
not fully covered by that timeout. So I think the title may be changed to 
"Streams should not block unexpectedly" :P

> Streams should never block infinitely
> -
>
> Key: KAFKA-6127
> URL: https://issues.apache.org/jira/browse/KAFKA-6127
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>Priority: Major
>
> Streams uses three consumer APIs that can block infinitely: {{commitSync()}}, 
> {{committed()}}, and {{position()}}.
> If we block within one operation, the whole {{StreamThread}} would block, and 
> the instance does not make any progress, becomes unresponsive (for example, 
> {{KafkaStreams#close()}} suffers), and we also might drop out of the consumer 
> group.
> We might consider to use {{wakeup()}} calls to unblock those operations to 
> keep {{StreamThread}} in a responsive state.
> Note: there are discussions to add timeouts to those calls, and thus, we could 
> get {{TimeoutExceptions}}. This would be easier to handle than using 
> {{wakeup()}}. Thus, we should keep an eye on those discussions. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6127) Streams should never block infinitely

2018-03-07 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390108#comment-16390108
 ] 

Matthias J. Sax commented on KAFKA-6127:


It might be bad to block in poll() as well – if somebody wants to shut down the 
application, we should have a mechanism to interrupt a blocking call.
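
The wakeup() route mentioned in the ticket would look roughly like this (a generic 
consumer sketch, not the actual StreamThread code):
{code:java}
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.errors.WakeupException;

public class WakeupSketch {
    // Called from the shutdown path on another thread: unblocks a consumer that
    // is stuck in commitSync()/committed()/position()/poll().
    static void requestShutdown(Consumer<?, ?> consumer) {
        consumer.wakeup();
    }

    static void checkCommitted(Consumer<byte[], byte[]> consumer, TopicPartition tp) {
        try {
            consumer.committed(tp); // may block for a long time today
        } catch (WakeupException e) {
            // unblocked by requestShutdown(); proceed with an orderly close
        }
    }
}
{code}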

> Streams should never block infinitely
> -
>
> Key: KAFKA-6127
> URL: https://issues.apache.org/jira/browse/KAFKA-6127
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>Priority: Major
>
> Streams uses three consumer APIs that can block infinitely: {{commitSync()}}, 
> {{committed()}}, and {{position()}}.
> If we block within one operation, the whole {{StreamThread}} would block, and 
> the instance does not make any progress, becomes unresponsive (for example, 
> {{KafkaStreams#close()}} suffers), and we also might drop out of the consumer 
> group.
> We might consider to use {{wakeup()}} calls to unblock those operations to 
> keep {{StreamThread}} in a responsive state.
> Note: there are discussions to add timeouts to those calls, and thus, we could 
> get {{TimeoutExceptions}}. This would be easier to handle than using 
> {{wakeup()}}. Thus, we should keep an eye on those discussions. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6621) Update Streams docs with details for UncaughtExceptionHandler

2018-03-07 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-6621:
--

 Summary: Update Streams docs with details for 
UncaughtExceptionHandler
 Key: KAFKA-6621
 URL: https://issues.apache.org/jira/browse/KAFKA-6621
 Project: Kafka
  Issue Type: Improvement
  Components: documentation, streams
Reporter: Matthias J. Sax


We should update the docs to explain that calling {{KafkaStreams#close()}} 
within the UncaughtExceptionHandler-callback might result in a deadlock and one 
should always specify a timeout parameter for this case.

As an alternative to avoid the deadlock, setting a flag in the handler and 
calling {{#close}} outside of the callback should also work.
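
A sketch of the flag-based variant described above (illustrative wiring; the topology 
and configs are placeholders):
{code:java}
import java.util.Properties;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;

public class HandlerCloseSketch {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "handler-demo");      // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        KafkaStreams streams = new KafkaStreams(new Topology(), props);      // placeholder topology

        CountDownLatch fatalError = new CountDownLatch(1);
        // Do NOT call streams.close() without a timeout inside the callback: it can deadlock.
        streams.setUncaughtExceptionHandler((thread, throwable) -> fatalError.countDown());

        streams.start();
        fatalError.await();
        // Close outside the callback (or, if closing inside it, always pass a timeout).
        streams.close(30, TimeUnit.SECONDS);
    }
}
{code}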



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6399) Consider reducing "max.poll.interval.ms" default for Kafka Streams

2018-03-07 Thread Khaireddine Rezgui (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390248#comment-16390248
 ] 

Khaireddine Rezgui commented on KAFKA-6399:
---

I agree with the value; 30 seconds seems to be sufficient.

> Consider reducing "max.poll.interval.ms" default for Kafka Streams
> --
>
> Key: KAFKA-6399
> URL: https://issues.apache.org/jira/browse/KAFKA-6399
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>Priority: Minor
>
> In Kafka {{0.10.2.1}} we changed the default value of 
> {{max.poll.interval.ms}} for Kafka Streams to {{Integer.MAX_VALUE}}. The 
> reason was that long state restore phases during rebalance could yield 
> "rebalance storms", as consumers dropped out of the consumer group even though they 
> were healthy, simply because they didn't call {{poll()}} during the state restore phase.
> In versions {{0.11}} and {{1.0}} the state restore logic was improved a lot, 
> and Kafka Streams now does call {{poll()}} even during the restore phase. 
> Therefore, we might consider setting a smaller timeout for 
> {{max.poll.interval.ms}} to detect misbehaving Kafka Streams applications 
> (i.e., targeting user code) that no longer make progress during regular 
> operations.
> The open question is what a good default might be. Maybe the actual 
> consumer default of 30 seconds might be sufficient. During one {{poll()}} 
> roundtrip, we would only call {{restoreConsumer.poll()}} once and restore a 
> single batch of records. This should take way less time than 30 seconds.
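
For anyone who wants to try the proposed value today, the Streams default can already 
be overridden per application (a sketch; 30000 ms is the value discussed above):
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class MaxPollIntervalOverrideSketch {
    static Properties streamsProps() {
        Properties props = new Properties();
        // Streams currently defaults this to Integer.MAX_VALUE; lower it so stuck
        // user code is detected instead of blocking the group indefinitely.
        props.put(StreamsConfig.consumerPrefix(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG), 30000);
        return props;
    }
}
{code}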



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6400) Consider setting default cache size to zero in Kafka Streams

2018-03-07 Thread Khaireddine Rezgui (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390270#comment-16390270
 ] 

Khaireddine Rezgui commented on KAFKA-6400:
---

I sometimes had the same experience when first using Kafka Streams; I 
understand the issue now.

Is CACHE_MAX_BYTES_BUFFERING_CONFIG the config mentioned in the 
description?
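
For context, the cache discussed here is controlled by {{cache.max.bytes.buffering}} 
({{StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG}}); disabling it for a single 
application looks like this (sketch):
{code:java}
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class DisableCacheSketch {
    static Properties streamsProps() {
        Properties props = new Properties();
        // cache.max.bytes.buffering = 0 turns off record caching, so every update
        // is forwarded downstream immediately (what first-time users usually expect).
        props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
        return props;
    }
}
{code}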

 

> Consider setting default cache size to zero in Kafka Streams
> 
>
> Key: KAFKA-6400
> URL: https://issues.apache.org/jira/browse/KAFKA-6400
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>Priority: Minor
>
> Since the introduction of record caching in the Kafka Streams DSL, we see regular 
> reports/questions from first-time users about "Kafka Streams does not emit 
> anything" or "Kafka Streams loses messages". Those reports are due to 
> record caching, not bugs, and indicate a bad user experience.
> We might consider setting the default cache size to zero to avoid those 
> issues and improve the experience for first-time users. This holds especially 
> for simple word-count demos (note: many people don't copy our example 
> word-count but build their own first demo app).
> Remark: before we had caching, many users got confused about our update 
> semantics and the fact that we emit an output record for each input record for 
> windowed aggregations (i.e., "please give me the 'final' result"). Thus, we need 
> to consider this and judge with care so as not to go "back and forth" with the default 
> user experience -- we have had fewer questions about this behavior lately.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-6399) Consider reducing "max.poll.interval.ms" default for Kafka Streams

2018-03-07 Thread Khaireddine Rezgui (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khaireddine Rezgui reassigned KAFKA-6399:
-

Assignee: Khaireddine Rezgui

> Consider reducing "max.poll.interval.ms" default for Kafka Streams
> --
>
> Key: KAFKA-6399
> URL: https://issues.apache.org/jira/browse/KAFKA-6399
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>Assignee: Khaireddine Rezgui
>Priority: Minor
>
> In Kafka {{0.10.2.1}} we changed the default value of 
> {{max.poll.interval.ms}} for Kafka Streams to {{Integer.MAX_VALUE}}. The 
> reason was that long state restore phases during rebalance could yield 
> "rebalance storms", as consumers dropped out of the consumer group even though they 
> were healthy, simply because they didn't call {{poll()}} during the state restore phase.
> In versions {{0.11}} and {{1.0}} the state restore logic was improved a lot, 
> and Kafka Streams now does call {{poll()}} even during the restore phase. 
> Therefore, we might consider setting a smaller timeout for 
> {{max.poll.interval.ms}} to detect misbehaving Kafka Streams applications 
> (i.e., targeting user code) that no longer make progress during regular 
> operations.
> The open question is what a good default might be. Maybe the actual 
> consumer default of 30 seconds might be sufficient. During one {{poll()}} 
> roundtrip, we would only call {{restoreConsumer.poll()}} once and restore a 
> single batch of records. This should take way less time than 30 seconds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-6400) Consider setting default cache size to zero in Kafka Streams

2018-03-07 Thread Khaireddine Rezgui (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390270#comment-16390270
 ] 

Khaireddine Rezgui edited comment on KAFKA-6400 at 3/7/18 10:06 PM:


I sometimes had the same experience when first using Kafka Streams; I 
understand the issue now.

Does CACHE_MAX_BYTES_BUFFERING_CONFIG refer to the config mentioned in the 
description?

 


was (Author: khairy):
I got sometimes the same experience in my first using of kafka stream, i 
understand now the issue.

Does CACHE_MAX_BYTES_BUFFERING_CONFIG is the config mentioned in  the 
description ?

 

> Consider setting default cache size to zero in Kafka Streams
> 
>
> Key: KAFKA-6400
> URL: https://issues.apache.org/jira/browse/KAFKA-6400
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>Priority: Minor
>
> Since the introduction of record caching in the Kafka Streams DSL, we see regular 
> reports/questions from first-time users about "Kafka Streams does not emit 
> anything" or "Kafka Streams loses messages". Those reports are due to 
> record caching, not bugs, and indicate a bad user experience.
> We might consider setting the default cache size to zero to avoid those 
> issues and improve the experience for first-time users. This holds especially 
> for simple word-count demos (note: many people don't copy our example 
> word-count but build their own first demo app).
> Remark: before we had caching, many users got confused about our update 
> semantics and the fact that we emit an output record for each input record for 
> windowed aggregations (i.e., "please give me the 'final' result"). Thus, we need 
> to consider this and judge with care so as not to go "back and forth" with the default 
> user experience -- we have had fewer questions about this behavior lately.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6530) Use actual first offset of messages when rolling log segment for magic v2

2018-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390468#comment-16390468
 ] 

ASF GitHub Bot commented on KAFKA-6530:
---

dhruvilshah3 opened a new pull request #4660: KAFKA-6530: Use actual first 
offset of message set when rolling log segment
URL: https://github.com/apache/kafka/pull/4660
 
 
   *More detailed description of your change,
   if necessary. The PR title and PR message become
   the squashed commit message, so use a separate
   comment to ping reviewers.*
   Use the exact first offset of message set when rolling log segment. This is 
possible to do for message format V2 and beyond without any performance 
penalty, because we have the first offset stored in the header. This augments 
the fix made in KAFKA-4451 to avoid using the heuristic for V2 and beyond 
messages.
   
   *Summary of testing strategy (including rationale)
   for the feature or bug fix. Unit and/or integration
   tests are expected for any behaviour change and
   system tests should be considered for larger changes.*
   Added unit tests to simulate cases where segment needs to roll because of 
overflow in index offsets. Verified that the new segment created in these cases 
uses the first offset, instead of the heuristic in use previously.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Use actual first offset of messages when rolling log segment for magic v2
> -
>
> Key: KAFKA-6530
> URL: https://issues.apache.org/jira/browse/KAFKA-6530
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Dhruvil Shah
>Priority: Major
>
> We've implemented a heuristic that, when rolling a log segment, determines the 
> base offset of the next segment without decompressing the message set to find 
> the actual first offset, in order to avoid overflow. With the v2 message format, we 
> can find the first offset without needing decompression, so we can set the 
> correct base offset exactly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6622) GroupMetadataManager.loadGroupsAndOffsets decompresses record batch needlessly

2018-03-07 Thread radai rosenblatt (JIRA)
radai rosenblatt created KAFKA-6622:
---

 Summary: GroupMetadataManager.loadGroupsAndOffsets decompresses 
record batch needlessly
 Key: KAFKA-6622
 URL: https://issues.apache.org/jira/browse/KAFKA-6622
 Project: Kafka
  Issue Type: Bug
Reporter: radai rosenblatt
Assignee: radai rosenblatt
 Attachments: kafka batch iteration funtime.png

When reading records from a consumer offsets batch, the entire batch is 
decompressed multiple times (once per record) as part of calling `batch.baseOffset`. 
This is a very expensive operation being called in a loop for no reason:
!kafka batch iteration funtime.png!
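
The fix amounts to reading the batch-level field once per batch instead of once per 
record; roughly (a simplified Java sketch, not the actual GroupMetadataManager code, 
which is Scala):
{code:java}
import org.apache.kafka.common.record.Record;
import org.apache.kafka.common.record.RecordBatch;

public class BatchIterationSketch {
    static void process(Iterable<? extends RecordBatch> batches) {
        for (RecordBatch batch : batches) {
            // For pre-v2 formats baseOffset() may have to decompress the batch,
            // so call it once per batch rather than once per record.
            long baseOffset = batch.baseOffset();
            for (Record record : batch)
                handle(baseOffset, record.offset(), record);
        }
    }

    static void handle(long baseOffset, long offset, Record record) { /* ... */ }
}
{code}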



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6622) GroupMetadataManager.loadGroupsAndOffsets decompresses record batch needlessly

2018-03-07 Thread radai rosenblatt (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390502#comment-16390502
 ] 

radai rosenblatt commented on KAFKA-6622:
-

see PR - https://github.com/apache/kafka/pull/4661

> GroupMetadataManager.loadGroupsAndOffsets decompresses record batch needlessly
> --
>
> Key: KAFKA-6622
> URL: https://issues.apache.org/jira/browse/KAFKA-6622
> Project: Kafka
>  Issue Type: Bug
>Reporter: radai rosenblatt
>Assignee: radai rosenblatt
>Priority: Major
> Attachments: kafka batch iteration funtime.png
>
>
> When reading records from a consumer offsets batch, the entire batch is 
> decompressed multiple times (once per record) as part of calling 
> `batch.baseOffset`. This is a very expensive operation being called in a loop 
> for no reason:
> !kafka batch iteration funtime.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-5939) Add a dryrun option to release.py

2018-03-07 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390528#comment-16390528
 ] 

Ewen Cheslack-Postava commented on KAFKA-5939:
--

I think dry run is fine if we're clear about what it means. The scariest part 
of developing the script originally was the final steps around pushing tags. 
Otherwise most of it is only affecting stuff that's easy to clean up. To me, 
the most useful dry run would:
 * Still prompt about the steps that would upload, but just say what they would 
do
 * Say what tagging it would do, but skip it
 * Still do the full build and provide artifacts, which allows you to do some 
test run and validation without really generating an RC

I'm not sure how the publication to the maven repo would work as a dry-run, 
maybe just publish to a local directory.

The other thing which could possibly be handled here or could be a different 
Jira is to have a local copy of all artifacts when you've completed. This would 
apply to the regular release artifacts (it was a pain to pull them back down 
from my Kafka home directory in order to promote the release) and the maven 
artifacts (which is useful for doing validation, which should also be 
automated).

> Add a dryrun option to release.py
> -
>
> Key: KAFKA-5939
> URL: https://issues.apache.org/jira/browse/KAFKA-5939
> Project: Kafka
>  Issue Type: New Feature
>  Components: tools
>Reporter: Damian Guy
>Priority: Major
>
> It would be great to add a `dryrun` feature to `release.py` so that it can be 
> used to test changes to the scripts etc. At the moment you need to make sure 
> all JIRAs are closed for the release, have no uncommitted changes etc., which 
> is a bit of a hassle when you just want to test a change you've made to the 
> script. There may be other things that need to be skipped, too



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6264) Log cleaner thread may die on legacy segment containing messages whose offsets are too large

2018-03-07 Thread Dhruvil Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390535#comment-16390535
 ] 

Dhruvil Shah commented on KAFKA-6264:
-

[~becket_qin] would you mind if I picked this issue up?

> Log cleaner thread may die on legacy segment containing messages whose 
> offsets are too large
> 
>
> Key: KAFKA-6264
> URL: https://issues.apache.org/jira/browse/KAFKA-6264
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1, 1.0.0, 0.11.0.2
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>Priority: Critical
> Fix For: 1.2.0
>
>
> We encountered a problem where some of the legacy log segments contain 
> messages whose offsets are larger than {{SegmentBaseOffset + Int.MaxValue}}.
> Prior to 0.10.2.0, we did not assert the offsets of the messages when appending 
> them to the log segments. Due to KAFKA-5413, the log cleaner may append 
> messages whose offset is greater than {{base_offset + Int.MaxValue}} into the 
> segment during the log compaction.
> After the brokers are upgraded, those log segments cannot be compacted 
> anymore because the compaction will fail immediately due to the offset range 
> assertion we added to the LogSegment.
> We have seen this issue in the __consumer_offsets topic so it could be a 
> general problem. There is no easy solution for the users to recover from this 
> case. 
> One solution is to split such log segments in the log cleaner once it sees a 
> message with problematic offset and append those messages to a separate log 
> segment with a larger base_offset.
> Due to the impact of the issue, we may want to consider backporting the fix 
> to previously affected versions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-6264) Log cleaner thread may die on legacy segment containing messages whose offsets are too large

2018-03-07 Thread Dhruvil Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390535#comment-16390535
 ] 

Dhruvil Shah edited comment on KAFKA-6264 at 3/8/18 12:57 AM:
--

[~becket_qin] if you don't have a working fix yet, would you mind if I picked 
this issue up?


was (Author: dhruvilshah):
[~becket_qin] would you mind if I picked this issue up?

> Log cleaner thread may die on legacy segment containing messages whose 
> offsets are too large
> 
>
> Key: KAFKA-6264
> URL: https://issues.apache.org/jira/browse/KAFKA-6264
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.1, 1.0.0, 0.11.0.2
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>Priority: Critical
> Fix For: 1.2.0
>
>
> We encountered a problem where some of the legacy log segments contain 
> messages whose offsets are larger than {{SegmentBaseOffset + Int.MaxValue}}.
> Prior to 0.10.2.0, we did not assert the offsets of the messages when appending 
> them to the log segments. Due to KAFKA-5413, the log cleaner may append 
> messages whose offset is greater than {{base_offset + Int.MaxValue}} into the 
> segment during the log compaction.
> After the brokers are upgraded, those log segments cannot be compacted 
> anymore because the compaction will fail immediately due to the offset range 
> assertion we added to the LogSegment.
> We have seen this issue in the __consumer_offsets topic so it could be a 
> general problem. There is no easy solution for the users to recover from this 
> case. 
> One solution is to split such log segments in the log cleaner once it sees a 
> message with problematic offset and append those messages to a separate log 
> segment with a larger base_offset.
> Due to the impact of the issue, we may want to consider backporting the fix 
> to previously affected versions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6623) Consider renaming inefficient RecordBatch operations

2018-03-07 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-6623:
--

 Summary: Consider renaming inefficient RecordBatch operations
 Key: KAFKA-6623
 URL: https://issues.apache.org/jira/browse/KAFKA-6623
 Project: Kafka
  Issue Type: Improvement
Reporter: Jason Gustafson


Certain batch-level operations are only efficient with the new message format 
version. For example, {{RecordBatch.baseOffset}} requires decompression for the 
old message format versions. It is a bit too easy at the moment to overlook the 
performance implications of using these methods, which results 
in issues like KAFKA-6622. We should consider either renaming them to make the 
complexity more apparent, or modifying the API so that the old message format does 
not expose the operation so conveniently. A similar case is the record count, 
which is efficient in v2, but inefficient for older formats. We handled this 
case by exposing a {{countOrNull}} method which only returns the count for v2.
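
The {{countOrNull}} pattern mentioned above looks like this at a call site (sketch):
{code:java}
import org.apache.kafka.common.record.Record;
import org.apache.kafka.common.record.RecordBatch;

public class RecordCountSketch {
    static int recordCount(RecordBatch batch) {
        Integer count = batch.countOrNull(); // non-null only for the v2 format
        if (count != null)
            return count;                    // cheap: read from the batch header
        int n = 0;
        for (Record ignored : batch)         // old formats: requires full iteration
            n++;
        return n;
    }
}
{code}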



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-6413) ReassignPartitionsCommand#parsePartitionReassignmentData() should give better error message when JSON is malformed

2018-03-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KAFKA-6413:
--
Labels: json  (was: )

> ReassignPartitionsCommand#parsePartitionReassignmentData() should give better 
> error message when JSON is malformed
> --
>
> Key: KAFKA-6413
> URL: https://issues.apache.org/jira/browse/KAFKA-6413
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
>  Labels: json
>
> In this thread: 
> http://search-hadoop.com/m/Kafka/uyzND1J9Hizcxo0X?subj=Partition+reassignment+data+file+is+empty
>  , Allen gave an example JSON string with extra comma where 
> partitionsToBeReassigned returned by 
> ReassignPartitionsCommand#parsePartitionReassignmentData() was empty.
> I tried the following example where a right bracket is removed:
> {code}
> val (partitionsToBeReassigned, replicaAssignment) = 
> ReassignPartitionsCommand.parsePartitionReassignmentData(
> 
> "{\"version\":1,\"partitions\":[{\"topic\":\"metrics\",\"partition\":0,\"replicas\":[1,2]},{\"topic\":\"metrics\",\"partition\":1,\"replicas\":[2,3]},}");
> {code}
> The returned partitionsToBeReassigned is empty (and no exception was thrown).
> The parser should give a better error message for malformed JSON strings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6624) log segment deletion could cause a disk to be marked offline incorrectly

2018-03-07 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-6624:
--

 Summary: log segment deletion could cause a disk to be marked 
offline incorrectly
 Key: KAFKA-6624
 URL: https://issues.apache.org/jira/browse/KAFKA-6624
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 1.1.0
Reporter: Jun Rao


Saw the following log.

[2018-03-06 23:12:20,721] ERROR Error while flushing log for topic1-0 in dir 
/data01/kafka-logs with offset 80993 (kafka.server.LogDirFailureChannel)
java.nio.channels.ClosedChannelException
        at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
        at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:379)
        at org.apache.kafka.common.record.FileRecords.flush(FileRecords.java:163)
        at kafka.log.LogSegment$$anonfun$flush$1.apply$mcV$sp(LogSegment.scala:375)
        at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:374)
        at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:374)
        at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
        at kafka.log.LogSegment.flush(LogSegment.scala:374)
        at kafka.log.Log$$anonfun$flush$1$$anonfun$apply$mcV$sp$4.apply(Log.scala:1374)
        at kafka.log.Log$$anonfun$flush$1$$anonfun$apply$mcV$sp$4.apply(Log.scala:1373)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at kafka.log.Log$$anonfun$flush$1.apply$mcV$sp(Log.scala:1373)
        at kafka.log.Log$$anonfun$flush$1.apply(Log.scala:1368)
        at kafka.log.Log$$anonfun$flush$1.apply(Log.scala:1368)
        at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
        at kafka.log.Log.flush(Log.scala:1368)
        at kafka.log.Log$$anonfun$roll$2$$anonfun$apply$1.apply$mcV$sp(Log.scala:1343)
        at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2018-03-06 23:12:20,722] INFO [ReplicaManager broker=0] Stopping serving 
replicas in dir /data01/kafka-logs (kafka.server.ReplicaManager)

It seems that topic1 was being deleted around the time when flushing was 
called. Then flushing hit an IOException, which caused the disk to be marked 
offline incorrectly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6624) log segment deletion could cause a disk to be marked offline incorrectly

2018-03-07 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390572#comment-16390572
 ] 

Jun Rao commented on KAFKA-6624:


One way to fix this is to add a flag for a log segment to be deleted. Then, if 
we hit an IOException during flushing, we could check if the deletion flag of 
the segment is set and if so, just ignore the IOException.
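
Sketched with hypothetical names (the real code is kafka.log.Log/LogSegment in Scala), 
the idea is:
{code:java}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

public class SegmentFlushSketch {
    // Hypothetical per-segment flag, set just before the segment is scheduled
    // for asynchronous deletion.
    private final AtomicBoolean markedForDeletion = new AtomicBoolean(false);

    void markForDeletion() {
        markedForDeletion.set(true);
    }

    void flush(SegmentChannel channel) throws IOException {
        try {
            channel.force();
        } catch (IOException e) {
            // If the segment is being deleted, its file channel may already be
            // closed; swallow the error instead of marking the log dir offline.
            if (!markedForDeletion.get())
                throw e;
        }
    }

    interface SegmentChannel { void force() throws IOException; }
}
{code}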

> log segment deletion could cause a disk to be marked offline incorrectly
> 
>
> Key: KAFKA-6624
> URL: https://issues.apache.org/jira/browse/KAFKA-6624
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
>Priority: Major
>
> Saw the following log.
> [2018-03-06 23:12:20,721] ERROR Error while flushing log for topic1-0 in dir 
> /data01/kafka-logs with offset 80993 (kafka.server.LogDirFailureChannel)
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
>         at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:379)
>         at 
> org.apache.kafka.common.record.FileRecords.flush(FileRecords.java:163)
>         at 
> kafka.log.LogSegment$$anonfun$flush$1.apply$mcV$sp(LogSegment.scala:375)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:374)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:374)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
>         at kafka.log.LogSegment.flush(LogSegment.scala:374)
>         at 
> kafka.log.Log$$anonfun$flush$1$$anonfun$apply$mcV$sp$4.apply(Log.scala:1374)
>         at 
> kafka.log.Log$$anonfun$flush$1$$anonfun$apply$mcV$sp$4.apply(Log.scala:1373)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.log.Log$$anonfun$flush$1.apply$mcV$sp(Log.scala:1373)
>         at kafka.log.Log$$anonfun$flush$1.apply(Log.scala:1368)
>         at kafka.log.Log$$anonfun$flush$1.apply(Log.scala:1368)
>         at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
>         at kafka.log.Log.flush(Log.scala:1368)
>         at 
> kafka.log.Log$$anonfun$roll$2$$anonfun$apply$1.apply$mcV$sp(Log.scala:1343)
>         at 
> kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
>         at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> [2018-03-06 23:12:20,722] INFO [ReplicaManager broker=0] Stopping serving 
> replicas in dir /data01/kafka-logs (kafka.server.ReplicaManager)
> It seems that topic1 was being deleted around the time when flushing was 
> called. Then flushing hit an IOException, which caused the disk to be marked 
> offline incorrectly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-6619) InstanceAlreadyExistsException while Tomcat starting up

2018-03-07 Thread xiezhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiezhi updated KAFKA-6619:
--
Description: 
I configured log4j to send application logs to Kafka.

There is only one producer, no others. So I couldn't figure out what happened.

log4j.properties-

log4j.rootLogger=INFO, kafka

#appender kafka
 log4j.appender.kafka=org.apache.kafka.log4jappender.KafkaLog4jAppender
 log4j.appender.kafka.topic=UAT_APP
 log4j.appender.A1.Threshold=INFO
 log4j.appender.kafka.syncSend=false

#multiple brokers are separated by comma ",".
 log4j.appender.kafka.brokerList=localhost:9091,localhost:9092,localhost:9093,
 log4j.appender.kafka.layout=org.apache.log4j.PatternLayout
 log4j.appender.kafka.layout.ConversionPattern=%d\{-MM-dd HH:mm:ss} %p %t 
%c (%F:%L) - %m%n

-end log4j.properties---

Here is the error log.

2018-03-07 14:54:57 INFO localhost-startStop-1 
org.apache.kafka.common.utils.AppInfoParser (AppInfoParser.java:109) - Kafka 
version : 1.0.0
 2018-03-07 14:54:57 INFO localhost-startStop-1 
org.apache.kafka.common.utils.AppInfoParser (AppInfoParser.java:110) - Kafka 
commitId : aaa7af6d4a11b29d
 2018-03-07 14:54:57 WARN localhost-startStop-1 
org.apache.kafka.common.utils.AppInfoParser (AppInfoParser.java:66) - Error 
registering AppInfo mbean
 javax.management.InstanceAlreadyExistsException: 
kafka.producer:type=app-info,id=producer-1
 at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
 at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
 at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
 at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
 at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
 at 
com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
 at 
org.apache.kafka.common.utils.AppInfoParser.registerAppInfo(AppInfoParser.java:62)
 at 
org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:427)
 at 
org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:291)
 at 
org.apache.kafka.log4jappender.KafkaLog4jAppender.getKafkaProducer(KafkaLog4jAppender.java:246)
 at 
org.apache.kafka.log4jappender.KafkaLog4jAppender.activateOptions(KafkaLog4jAppender.java:240)
 at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
 at 
org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
 at 
org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
 at 
org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809)
 at 
org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
 at 
org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615)
 at 
org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502)
 at 
org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
 at 
org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
 at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
 at org.apache.log4j.Logger.getLogger(Logger.java:104)
 at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:289)
 at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:109)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at 
org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1040)
 at 
org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:838)
 at 
org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:601)
 at 
org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:333)
 at 
org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:307)
 at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:645)
 at 
org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:184)
 at 
org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:47)
 at 
org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5118)
 at 
org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5634)
 at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java

[jira] [Commented] (KAFKA-6624) log segment deletion could cause a disk to be marked offline incorrectly

2018-03-07 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390727#comment-16390727
 ] 

Jun Rao commented on KAFKA-6624:


[~lindong], do you think this is an issue? Thanks.

> log segment deletion could cause a disk to be marked offline incorrectly
> 
>
> Key: KAFKA-6624
> URL: https://issues.apache.org/jira/browse/KAFKA-6624
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
>Priority: Major
>
> Saw the following log.
> [2018-03-06 23:12:20,721] ERROR Error while flushing log for topic1-0 in dir 
> /data01/kafka-logs with offset 80993 (kafka.server.LogDirFailureChannel)
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
>         at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:379)
>         at 
> org.apache.kafka.common.record.FileRecords.flush(FileRecords.java:163)
>         at 
> kafka.log.LogSegment$$anonfun$flush$1.apply$mcV$sp(LogSegment.scala:375)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:374)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:374)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
>         at kafka.log.LogSegment.flush(LogSegment.scala:374)
>         at 
> kafka.log.Log$$anonfun$flush$1$$anonfun$apply$mcV$sp$4.apply(Log.scala:1374)
>         at 
> kafka.log.Log$$anonfun$flush$1$$anonfun$apply$mcV$sp$4.apply(Log.scala:1373)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.log.Log$$anonfun$flush$1.apply$mcV$sp(Log.scala:1373)
>         at kafka.log.Log$$anonfun$flush$1.apply(Log.scala:1368)
>         at kafka.log.Log$$anonfun$flush$1.apply(Log.scala:1368)
>         at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
>         at kafka.log.Log.flush(Log.scala:1368)
>         at 
> kafka.log.Log$$anonfun$roll$2$$anonfun$apply$1.apply$mcV$sp(Log.scala:1343)
>         at 
> kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
>         at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> [2018-03-06 23:12:20,722] INFO [ReplicaManager broker=0] Stopping serving 
> replicas in dir /data01/kafka-logs (kafka.server.ReplicaManager)
> It seems that topic1 was being deleted around the time when flushing was 
> called. Then flushing hit an IOException, which caused the disk to be marked 
> offline incorrectly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6624) log segment deletion could cause a disk to be marked offline incorrectly

2018-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390748#comment-16390748
 ] 

ASF GitHub Bot commented on KAFKA-6624:
---

lindong28 opened a new pull request #4663: KAFKA-6624; Prevent concurrent log 
flush and log deletion
URL: https://github.com/apache/kafka/pull/4663
 
 
   
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> log segment deletion could cause a disk to be marked offline incorrectly
> 
>
> Key: KAFKA-6624
> URL: https://issues.apache.org/jira/browse/KAFKA-6624
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
>Priority: Major
>
> Saw the following log.
> [2018-03-06 23:12:20,721] ERROR Error while flushing log for topic1-0 in dir 
> /data01/kafka-logs with offset 80993 (kafka.server.LogDirFailureChannel)
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
>         at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:379)
>         at 
> org.apache.kafka.common.record.FileRecords.flush(FileRecords.java:163)
>         at 
> kafka.log.LogSegment$$anonfun$flush$1.apply$mcV$sp(LogSegment.scala:375)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:374)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:374)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
>         at kafka.log.LogSegment.flush(LogSegment.scala:374)
>         at 
> kafka.log.Log$$anonfun$flush$1$$anonfun$apply$mcV$sp$4.apply(Log.scala:1374)
>         at 
> kafka.log.Log$$anonfun$flush$1$$anonfun$apply$mcV$sp$4.apply(Log.scala:1373)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.log.Log$$anonfun$flush$1.apply$mcV$sp(Log.scala:1373)
>         at kafka.log.Log$$anonfun$flush$1.apply(Log.scala:1368)
>         at kafka.log.Log$$anonfun$flush$1.apply(Log.scala:1368)
>         at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
>         at kafka.log.Log.flush(Log.scala:1368)
>         at 
> kafka.log.Log$$anonfun$roll$2$$anonfun$apply$1.apply$mcV$sp(Log.scala:1343)
>         at 
> kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
>         at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> [2018-03-06 23:12:20,722] INFO [ReplicaManager broker=0] Stopping serving 
> replicas in dir /data01/kafka-logs (kafka.server.ReplicaManager)
> It seems that topic1 was being deleted around the time when flushing was 
> called. Then flushing hit an IOException, which caused the disk to be marked 
> offline incorrectly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6624) log segment deletion could cause a disk to be marked offline incorrectly

2018-03-07 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390753#comment-16390753
 ] 

Dong Lin commented on KAFKA-6624:
-

[~junrao] It may be reasonable to grab the log's lock during log.flush(). This 
would be consistent with the idea that all read/write operations on log 
segments are synchronized using the per-log lock.

It will incur slightly higher overhead, but personally I think that is OK: we 
only flush the log segments after the previously flushed offset (i.e. the 
recovery offset). Alternatively, we could use a flag to optimize this, though I 
am not sure the performance improvement from that optimization is worth the 
extra complexity.

Can you review https://github.com/apache/kafka/pull/4663/files?
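To make the idea above concrete, here is a minimal Java sketch (hypothetical 
classes, not the actual kafka.log.Log / LogSegment code) in which flush and 
deletion are serialized on the same per-log lock, so a flush can never hit a 
segment whose channel has already been closed by deletion:
{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for Log/LogSegment, only to illustrate the locking idea.
class SketchSegment {
    private final long baseOffset;
    private boolean closed = false;

    SketchSegment(long baseOffset) { this.baseOffset = baseOffset; }
    long baseOffset() { return baseOffset; }

    void flush() {
        if (closed) {
            throw new IllegalStateException("segment already closed");
        }
        // fsync of the underlying channel would happen here
    }

    void close() { closed = true; }
}

class SketchLog {
    private final Object lock = new Object();
    private final List<SketchSegment> segments = new ArrayList<>();
    private long recoveryPoint = 0L;

    // Flush only the segments at or beyond the recovery point, under the per-log lock.
    void flush(long newRecoveryPoint) {
        synchronized (lock) {
            for (SketchSegment segment : segments) {
                if (segment.baseOffset() >= recoveryPoint) {
                    segment.flush();   // cannot race with delete(): same lock
                }
            }
            recoveryPoint = newRecoveryPoint;
        }
    }

    // Deletion closes segments under the same lock, so an in-flight flush either
    // completes first or sees the segments removed, never a half-closed channel.
    void delete() {
        synchronized (lock) {
            for (SketchSegment segment : segments) {
                segment.close();
            }
            segments.clear();
        }
    }
}
{code}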

 

> log segment deletion could cause a disk to be marked offline incorrectly
> 
>
> Key: KAFKA-6624
> URL: https://issues.apache.org/jira/browse/KAFKA-6624
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
>Priority: Major
>
> Saw the following log.
> [2018-03-06 23:12:20,721] ERROR Error while flushing log for topic1-0 in dir 
> /data01/kafka-logs with offset 80993 (kafka.server.LogDirFailureChannel)
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
>         at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:379)
>         at 
> org.apache.kafka.common.record.FileRecords.flush(FileRecords.java:163)
>         at 
> kafka.log.LogSegment$$anonfun$flush$1.apply$mcV$sp(LogSegment.scala:375)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:374)
>         at kafka.log.LogSegment$$anonfun$flush$1.apply(LogSegment.scala:374)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
>         at kafka.log.LogSegment.flush(LogSegment.scala:374)
>         at 
> kafka.log.Log$$anonfun$flush$1$$anonfun$apply$mcV$sp$4.apply(Log.scala:1374)
>         at 
> kafka.log.Log$$anonfun$flush$1$$anonfun$apply$mcV$sp$4.apply(Log.scala:1373)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>         at kafka.log.Log$$anonfun$flush$1.apply$mcV$sp(Log.scala:1373)
>         at kafka.log.Log$$anonfun$flush$1.apply(Log.scala:1368)
>         at kafka.log.Log$$anonfun$flush$1.apply(Log.scala:1368)
>         at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
>         at kafka.log.Log.flush(Log.scala:1368)
>         at 
> kafka.log.Log$$anonfun$roll$2$$anonfun$apply$1.apply$mcV$sp(Log.scala:1343)
>         at 
> kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
>         at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> [2018-03-06 23:12:20,722] INFO [ReplicaManager broker=0] Stopping serving 
> replicas in dir /data01/kafka-logs (kafka.server.ReplicaManager)
> It seems that topic1 was being deleted around the time when flushing was 
> called. Then flushing hit an IOException, which caused the disk to be marked 
> offline incorrectly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-6623) Consider renaming inefficient RecordBatch operations

2018-03-07 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-6623:
---
Description: Certain batch-level operations are only efficient with the new 
message format version. For example, {{RecordBatch.baseOffset}} requires 
decompression for the old message format versions. It is a bit too easy at the 
moment to overlook the performance implications when using these methods, which 
results in issues like KAFKA-6622. We should consider either renaming them to 
make the cost more apparent, or modifying the API so that the old message 
format simply does not expose the operation conveniently. A similar case is the 
record count, which is efficient in v2 but inefficient for older formats; we 
handled this case by exposing a {{countOrNull}} method which only returns the 
count for v2.  (was: Certain batch-level operations are only efficient with the 
new message format version. For example, {{RecordBatch.baseOffset}} requires 
decompression for the old message format versions. It is a bit too easy at the 
moment to overlook the performance implications when using these methods at the 
moment, which results in issues like KAFKA-6622. We should consider either 
renaming them to make the complexity more apparent or modify the API so that 
the old message format simply expose the option conveniently. A similar case is 
the record count, which is efficient in v2, but inefficient for older formats. 
We handled this case by exposing a {{countOrNull}} method which only returns 
the count for v2.)

> Consider renaming inefficient RecordBatch operations
> 
>
> Key: KAFKA-6623
> URL: https://issues.apache.org/jira/browse/KAFKA-6623
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Priority: Major
>
> Certain batch-level operations are only efficient with the new message format 
> version. For example, {{RecordBatch.baseOffset}} requires decompression for 
> the old message format versions. It is a bit too easy at the moment to 
> overlook the performance implications when using these methods, which results 
> in issues like KAFKA-6622. We should consider either renaming them to make 
> the cost more apparent, or modifying the API so that the old message format 
> simply does not expose the operation conveniently. A similar case is the 
> record count, which is efficient in v2 but inefficient for older formats; we 
> handled this case by exposing a {{countOrNull}} method which only returns the 
> count for v2.
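
For readers unfamiliar with the {{countOrNull}} pattern mentioned above, a 
minimal Java sketch of that style of API (hypothetical names, not the actual 
org.apache.kafka.common.record interfaces): the accessor is cheap for v2 and 
forces callers to opt in explicitly to the expensive path for older formats.
{code:java}
// Hypothetical sketch of the "OrNull" style discussed above; not the real RecordBatch API.
interface BatchView {
    // Cheap for the v2 format, where the count lives in the batch header;
    // returns null for older formats instead of silently decompressing the batch.
    Integer countOrNull();

    // The expensive fallback is still available, but callers must ask for it
    // explicitly, so the cost is visible at the call site.
    int countWithDecompression();
}

final class BatchCounts {
    private BatchCounts() { }

    static int count(BatchView batch) {
        Integer cheap = batch.countOrNull();
        return cheap != null ? cheap : batch.countWithDecompression();
    }
}
{code}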



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6625) kafka offset reset when I upgrade kafka from 0.11.0 to 1.0.0 version

2018-03-07 Thread barry010 (JIRA)
barry010 created KAFKA-6625:
---

 Summary: kafka offset reset when I upgrade kafka from 0.11.0 to 
1.0.0 version
 Key: KAFKA-6625
 URL: https://issues.apache.org/jira/browse/KAFKA-6625
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.11.0.0
Reporter: barry010
 Attachments: consumer_offset-36.txt, flume.log.txt, server.log.txt

While upgrading Kafka from 0.11.0 to 1.0.0, I shut down broker01 with a kill 
signal, updated the code, and started the new broker01. During that window I 
found that the consumption rate of one of my consumer groups rose to 10M/s.

Looking at the __consumer_offsets-63 partition, I found that most of the 
partitions' offsets were reset at timestamp 1520316174188 (2018/3/6 14:02:54). 
This timestamp falls between shutting down broker01 and starting the new 
broker01.

I have not set auto.offset.reset, and this topic has only one consumer group, 
which consumes via Flume.

I have attached three logs: the consumer log, the coordinator log, and the 
__consumer_offsets info.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-6625) kafka offset reset when I upgrade kafka from 0.11.0 to 1.0.0 version

2018-03-07 Thread barry010 (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

barry010 updated KAFKA-6625:

Description: 
While upgrading Kafka from 0.11.0 to 1.0.0, I shut down broker01 with a kill 
signal, updated the code, and started the new broker01. During that window I 
found that the consumption rate of one of my consumer groups rose to 10M/s.

Looking at the __consumer_offsets-63 partition, I found that most of the 
partitions' offsets were reset at timestamp 1520316174188 (2018/3/6 14:02:54). 
This timestamp falls between shutting down broker01 and starting the new 
broker01.

I have not set auto.offset.reset, and this topic has only one consumer group, 
which consumes via Flume.

I have attached three logs: the consumer log, the coordinator log, and one 
__consumer_offsets partition's info.

 

  was:
While upgrading Kafka from 0.11.0 to 1.0.0, I shut down broker01 with a kill 
signal, updated the code, and started the new broker01. During that window I 
found that the consumption rate of one of my consumer groups rose to 10M/s.

Looking at the __consumer_offsets-63 partition, I found that most of the 
partitions' offsets were reset at timestamp 1520316174188 (2018/3/6 14:02:54). 
This timestamp falls between shutting down broker01 and starting the new 
broker01.

I have not set auto.offset.reset, and this topic has only one consumer group, 
which consumes via Flume.

I have attached three logs: the consumer log, the coordinator log, and the 
__consumer_offsets info.

 


> kafka offset reset when I upgrade kafka from 0.11.0 to 1.0.0 version
> 
>
> Key: KAFKA-6625
> URL: https://issues.apache.org/jira/browse/KAFKA-6625
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.0.0
>Reporter: barry010
>Priority: Critical
> Attachments: consumer_offset-36.txt, flume.log.txt, server.log.txt
>
>
> While upgrading Kafka from 0.11.0 to 1.0.0, I shut down broker01 with a kill 
> signal, updated the code, and started the new broker01. During that window I 
> found that the consumption rate of one of my consumer groups rose to 10M/s.
> Looking at the __consumer_offsets-63 partition, I found that most of the 
> partitions' offsets were reset at timestamp 1520316174188 (2018/3/6 
> 14:02:54). This timestamp falls between shutting down broker01 and starting 
> the new broker01.
> I have not set auto.offset.reset, and this topic has only one consumer 
> group, which consumes via Flume.
> I have attached three logs: the consumer log, the coordinator log, and one 
> __consumer_offsets partition's info.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-6625) kafka offset reset when I upgrade kafka from 0.11.0 to 1.0.0 version

2018-03-07 Thread barry010 (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

barry010 updated KAFKA-6625:

Environment: 
flume 1.8 with Kafka client version : 0.9.0.1
Kafka version : 0.11.0

> kafka offset reset when I upgrade kafka from 0.11.0 to 1.0.0 version
> 
>
> Key: KAFKA-6625
> URL: https://issues.apache.org/jira/browse/KAFKA-6625
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.0.0
> Environment: flume 1.8 with Kafka client version : 0.9.0.1
> Kafka version : 0.11.0
>Reporter: barry010
>Priority: Critical
> Attachments: consumer_offset-36.txt, flume.log.txt, server.log.txt
>
>
> While upgrading Kafka from 0.11.0 to 1.0.0, I shut down broker01 with a kill 
> signal, updated the code, and started the new broker01. During that window I 
> found that the consumption rate of one of my consumer groups rose to 10M/s.
> Looking at the __consumer_offsets-63 partition, I found that most of the 
> partitions' offsets were reset at timestamp 1520316174188 (2018/3/6 
> 14:02:54). This timestamp falls between shutting down broker01 and starting 
> the new broker01.
> I have not set auto.offset.reset, and this topic has only one consumer 
> group, which consumes via Flume.
> I have attached three logs: the consumer log, the coordinator log, and one 
> __consumer_offsets partition's info.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6626) Performance bottleneck in Kafka Connect sendRecords

2018-03-07 Thread JIRA
Maciej Bryński created KAFKA-6626:
-

 Summary: Performance bottleneck in Kafka Connect sendRecords
 Key: KAFKA-6626
 URL: https://issues.apache.org/jira/browse/KAFKA-6626
 Project: Kafka
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Maciej Bryński
 Attachments: MapPerf.java, image-2018-03-08-08-35-19-247.png

Kafka Connect is using IdentityHashMap for storing records.

[https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L239]

Unfortunately this solution is very slow (2 times slower than normal HashMap / 
HashSet).

Benchmark result (code in attachment).
{code:java}
Identity 3977
Set 2442
Map 2207
Fast Set 2067
{code}
This problem is greatly slowing Kafka Connect.

!image-2018-03-08-08-35-19-247.png!
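
The attached MapPerf.java is not reproduced here, so the following is a rough, 
hypothetical sketch of such a comparison for readers who want to reproduce the 
IdentityHashMap vs. HashMap gap themselves (results vary by JVM and are only 
indicative). Note that Connect presumably relies on identity semantics to track 
individual record instances, so a plain HashMap would not be a drop-in 
replacement; the sketch only measures raw map cost.
{code:java}
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class MapPerfSketch {
    private static final int N = 1_000_000;

    public static void main(String[] args) {
        Object[] keys = new Object[N];
        for (int i = 0; i < N; i++) {
            keys[i] = new Object();
        }
        // One warm-up pass per map type to reduce JIT noise, then a measured pass.
        run(new IdentityHashMap<>(), keys);
        run(new HashMap<>(), keys);
        System.out.println("IdentityHashMap: " + run(new IdentityHashMap<>(), keys) + " ms");
        System.out.println("HashMap:         " + run(new HashMap<>(), keys) + " ms");
    }

    // Inserts every key and looks it up again, returning the elapsed milliseconds.
    private static long run(Map<Object, Boolean> map, Object[] keys) {
        long start = System.nanoTime();
        for (Object k : keys) {
            map.put(k, Boolean.TRUE);
        }
        long hits = 0;
        for (Object k : keys) {
            if (map.containsKey(k)) {
                hits++;
            }
        }
        if (hits != keys.length) {
            throw new AssertionError("unexpected miss");
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}
{code}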

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-6626) Performance bottleneck in Kafka Connect sendRecords

2018-03-07 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/KAFKA-6626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Bryński updated KAFKA-6626:
--
Description: 
Kafka Connect is using IdentityHashMap for storing records.

[https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L239]

Unfortunately this solution is very slow (2 times slower than normal HashMap / 
HashSet).

Benchmark result (code in attachment).
{code:java}
Identity 4220
Set 2115
Map 1941
Fast Set 2121
{code}
This problem is greatly slowing Kafka Connect.

!image-2018-03-08-08-35-19-247.png!

 

  was:
Kafka Connect is using IdentityHashMap for storing records.

[https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L239]

Unfortunately this solution is very slow (2 times slower than normal HashMap / 
HashSet).

Benchmark result (code in attachment).
{code:java}
Identity 3977
Set 2442
Map 2207
Fast Set 2067
{code}
This problem is greatly slowing Kafka Connect.

!image-2018-03-08-08-35-19-247.png!

 


> Performance bottleneck in Kafka Connect sendRecords
> ---
>
> Key: KAFKA-6626
> URL: https://issues.apache.org/jira/browse/KAFKA-6626
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Maciej Bryński
>Priority: Major
> Attachments: MapPerf.java, image-2018-03-08-08-35-19-247.png
>
>
> Kafka Connect is using IdentityHashMap for storing records.
> [https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L239]
> Unfortunately this solution is very slow (2 times slower than normal HashMap 
> / HashSet).
> Benchmark result (code in attachment).
> {code:java}
> Identity 4220
> Set 2115
> Map 1941
> Fast Set 2121
> {code}
> This problem is greatly slowing Kafka Connect.
> !image-2018-03-08-08-35-19-247.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-1368) Upgrade log4j

2018-03-07 Thread Vladislav Pernin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390872#comment-16390872
 ] 

Vladislav Pernin commented on KAFKA-1368:
-

Back in 2014, the log4j upgrade was only needed to be able to use richer logging patterns.

Anyway, using a generic facade like slf4j is probably a good idea.

> Upgrade log4j
> -
>
> Key: KAFKA-1368
> URL: https://issues.apache.org/jira/browse/KAFKA-1368
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Vladislav Pernin
>Assignee: Mickael Maison
>Priority: Major
>
> Upgrade log4j to at least 1.2.16 or 1.2.17.
> This makes EnhancedPatternLayout available.
> It allows setting delimiters around the full log event, stack trace included, 
> making log message collection easier with tools like Logstash.
> Example : <[%d{}]...[%t] %m%throwable>%n
> <[2014-04-08 11:07:20,360] ERROR [KafkaApi-1] Error when processing fetch 
> request for partition [X,6] offset 700 from consumer with correlation id 
> 0 (kafka.server.KafkaApis)
> kafka.common.OffsetOutOfRangeException: Request for offset 700 but we only 
> have log segments in the range 16021 to 16021.
> at kafka.log.Log.read(Log.scala:429)
> ...
> at java.lang.Thread.run(Thread.java:744)>
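
As a rough illustration of what the description asks for (the pattern string 
and class names below are assumptions based on the example above, not taken 
from this issue), a minimal Java sketch that configures log4j 1.2.16+ 
programmatically with EnhancedPatternLayout so each event, stack trace 
included, is wrapped in < and > delimiters:
{code:java}
import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.EnhancedPatternLayout;
import org.apache.log4j.Logger;

public class DelimitedLoggingSketch {
    public static void main(String[] args) {
        // Wrap the whole event, including the stack trace, in < ... > so that a
        // collector such as Logstash can treat it as a single multi-line message.
        EnhancedPatternLayout layout =
                new EnhancedPatternLayout("<[%d{ISO8601}] %-5p [%t] %m%throwable>%n");
        Logger.getRootLogger().addAppender(new ConsoleAppender(layout));

        Logger log = Logger.getLogger(DelimitedLoggingSketch.class);
        log.error("Error when processing fetch request",
                new IllegalStateException("offset out of range"));
    }
}
{code}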



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)