HDFS FsShell getmerge

2015-07-24 Thread Jan Filipiak

Hello Hadoop users,

I have an idea for a small feature for the getmerge tool. I recently
needed to use the newline option -nl because the files I needed to
merge simply didn't have one.
I was merging all the files from one directory, and unfortunately this
directory also included empty files, which effectively led to multiple
newlines being appended after some files.

I needed to remove them manually afterwards.

In this situation it would be good to have another argument that allows
skipping empty files. I have written down two changes one could try, at
the end of this mail. Would you consider this a good improvement to the
command line tools?


Things one could try to implement this feature:

The call to IOUtils.copyBytes(in, out, getConf(), false); doesn't
return the number of bytes copied, which would be convenient, as one
could skip appending the newline when 0 bytes were copied.

Or one could check the file size beforehand.
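
A rough sketch of the second variant, checking the file size before
copying, just to make the idea concrete. The skipEmptyFiles and
addNewline flags and the surrounding loop are illustrative, not the
actual FsShell code:

// Inside a hypothetical getmerge copy loop (PathData comes from
// org.apache.hadoop.fs.shell, IOUtils from org.apache.hadoop.io):
for (PathData src : srcs) {
    // Proposed option: skip empty files entirely, so no delimiter
    // is appended for them.
    if (skipEmptyFiles && src.fs.getFileStatus(src.path).getLen() == 0) {
        continue;
    }
    FSDataInputStream in = src.fs.open(src.path);
    try {
        IOUtils.copyBytes(in, out, getConf(), false);
    } finally {
        in.close();
    }
    if (addNewline) {
        out.write('\n'); // only after non-empty files
    }
}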

Please let me know if you would consider this useful and worth a
feature ticket in Jira.


Thank you
Jan


Re: HDFS FsShell getmerge

2015-07-24 Thread Jan Filipiak

Sorry, wrong mailing list.

On 24.07.2015 16:44, Jan Filipiak wrote:

Hello Hadoop users,

I have an idea for a small feature for the getmerge tool. I recently
needed to use the newline option -nl because the files I needed to
merge simply didn't have one.
I was merging all the files from one directory, and unfortunately this
directory also included empty files, which effectively led to multiple
newlines being appended after some files.

I needed to remove them manually afterwards.

In this situation it would be good to have another argument that
allows skipping empty files. I have written down two changes one could
try, at the end of this mail. Would you consider this a good
improvement to the command line tools?


Things one could try to implement this feature:

The call to IOUtils.copyBytes(in, out, getConf(), false); doesn't
return the number of bytes copied, which would be convenient, as one
could skip appending the newline when 0 bytes were copied.

Or one could check the file size beforehand.

Please let me know if you would consider this useful and worth a
feature ticket in Jira.


Thank you
Jan




Re: producer performance test on multiple nodes

2015-07-24 Thread Gwen Shapira
Does topic speedx1 exist?
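
One quick way to check, with the topics tool that ships with 0.8.2 (the
ZooKeeper address is a placeholder):

bin/kafka-topics.sh --zookeeper <zookeeper-host>:2181 --describe --topic speedx1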

On Fri, Jul 24, 2015 at 7:09 AM, Yuheng Du yuheng.du.h...@gmail.com wrote:
 Hi,

 I am trying to run 20 performance tests on 10 nodes using pbsdsh.

 The messages are sent to a 6-broker cluster. It seems to work for a
 while. When I delete the test queue and rerun the test, the broker does not
 seem to process incoming messages:

 [yuhengd@node1739 kafka_2.10-0.8.2.1]$ bin/kafka-run-class.sh
 org.apache.kafka.clients.tools.ProducerPerformance speedx1 5000 100 -1
 acks=1 bootstrap.servers=130.127.133.72:9092 buffer.memory=67108864
 batch.size=8196

 1 records sent, 0.0 records/sec (0.00 MB/sec), 6.0 ms avg latency,
 6.0 max latency.

 org.apache.kafka.common.errors.TimeoutException: Failed to update metadata
 after 79 ms.


 Can anyone suggest what the problem is? Or what configuration should I
 change? Thanks!


Re: New consumer - partitions auto assigned only on poll

2015-07-24 Thread Stevo Slavić
Hello Jason,

Thanks for feedback. I've created ticket for this
https://issues.apache.org/jira/browse/KAFKA-2359

Kind regards,
Stevo Slavic.
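
For reference, a minimal illustration of the behavior the ticket is
about, assuming the new-consumer API as of this snapshot (exact
signatures may differ):

// props holds bootstrap.servers etc.; setup omitted for brevity.
KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
consumer.subscribe("my-topic");
// consumer.subscriptions() returns an empty set here, and the
// rebalance callback has not fired yet.
consumer.poll(100);
// Only now, inside poll(), has ensurePartitionAssignment() run, so
// consumer.subscriptions() contains the assigned partitions.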

On Wed, Jul 22, 2015 at 6:18 PM, Jason Gustafson ja...@confluent.io wrote:

 Hey Stevo,

 That's a good point. I think the javadoc is pretty clear that this could
 return no partitions when the consumer has no active assignment, but it may
 be a little unintuitive to have to call poll() after subscribing before you
 can get the assigned partitions. I can't think of a strong reason not to go
 ahead with the assignment in subscriptions() other than to keep it
 non-blocking. Perhaps you can open a ticket and we can get feedback from
 some other devs?

 Thanks,
 Jason

 On Wed, Jul 22, 2015 at 2:09 AM, Stevo Slavić ssla...@gmail.com wrote:

  Hello Apache Kafka community,
 
  In the new consumer I encountered unexpected behavior. After
  constructing a KafkaConsumer instance with a configured consumer
  rebalance callback handler, and subscribing to a topic with
  consumer.subscribe(topic), retrieving subscriptions would return an
  empty set and the callback handler would never get called (no
  partitions ever assigned or revoked), no matter how long the
  instance was up.
 
  Then I found, by inspecting the KafkaConsumer code, that partition
  assignment is only triggered on the first poll; pollOnce has:
 
  // ensure we have partitions assigned if we expect to
  if (subscriptions.partitionsAutoAssigned())
  coordinator.ensurePartitionAssignment();
 
  Would it make sense to include this fragment in the
  KafkaConsumer.subscriptions accessor as well?
 
  Kind regards,
  Stevo Slavic.
 



Re: producer performance test on multiple nodes

2015-07-24 Thread Yuheng Du
I deleted the queue and recreated it before rerunning the test. Things are
working after restarting the broker cluster, thanks!

On Fri, Jul 24, 2015 at 12:06 PM, Gwen Shapira gshap...@cloudera.com
wrote:

 Does topic speedx1 exist?

 On Fri, Jul 24, 2015 at 7:09 AM, Yuheng Du yuheng.du.h...@gmail.com
 wrote:
  Hi,
 
  I am trying to run 20 performance tests on 10 nodes using pbsdsh.
 
  The messages are sent to a 6-broker cluster. It seems to work for a
  while. When I delete the test queue and rerun the test, the broker does
  not seem to process incoming messages:
 
  [yuhengd@node1739 kafka_2.10-0.8.2.1]$ bin/kafka-run-class.sh
  org.apache.kafka.clients.tools.ProducerPerformance speedx1 5000 100
 -1
  acks=1 bootstrap.servers=130.127.133.72:9092 buffer.memory=67108864
  batch.size=8196
 
  1 records sent, 0.0 records/sec (0.00 MB/sec), 6.0 ms avg latency,
  6.0 max latency.
 
  org.apache.kafka.common.errors.TimeoutException: Failed to update
 metadata
  after 79 ms.
 
 
  Can anyone suggest what the problem is? Or what configuration should I
  change? Thanks!



New consumer - the offset one gets in poll is not the offset one is supposed to commit

2015-07-24 Thread Stevo Slavić
Hello Apache Kafka community,

Say there is only one topic with a single partition and a single message
on it.
Calling poll with the new consumer will return a ConsumerRecord for that
message, and it will have an offset of 0.

After processing the message, the current KafkaConsumer implementation
expects one to commit not offset 0 as processed, but offset 1, the next
offset/position one would like to consume.

Does this sound strange to you as well?

I'm wondering whether this offset+1 handling for the next position to read
couldn't be done in one place, in the KafkaConsumer implementation or the
broker or wherever, instead of every user of KafkaConsumer having to do it.
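
For illustration, a sketch of what each caller currently has to do,
assuming a commit API that takes explicit offsets (names follow this
pre-release consumer and may differ in the final API; TopicPartition is
from org.apache.kafka.common):

// The committed value is the *next* position to read, i.e.
// record.offset() + 1, not the offset of the record just processed.
Map<TopicPartition, Long> offsets = new HashMap<>();
ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records) {
    process(record); // hypothetical application handler
    offsets.put(new TopicPartition(record.topic(), record.partition()),
                record.offset() + 1);
}
consumer.commit(offsets, CommitType.SYNC);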

Kind regards,
Stevo Slavic.


Log Deletion Behavior

2015-07-24 Thread JIEFU GONG
Hi all,

I have a few broad questions on how log deletion works, specifically in
conjunction with the log.retention.time setting. Say I published some
messages to some topics when the configuration was originally set to
something like log.retention.hours=168 (default). If I publish these
messages successfully, then later set the configuration to something like
log.retention.minutes=1, are those logs supposed to persist for the newest
settings or the old settings? Right now my logs are refusing to delete
themselves unless I specifically mark them for deletion -- is this the
correct/anticipated/wanted behavior?

Thanks for the help!

-- 

Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences

jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427


producer performance test on multiple nodes

2015-07-24 Thread Yuheng Du
Hi,

I am trying to run 20 performance tests on 10 nodes using pbsdsh.

The messages are sent to a 6-broker cluster. It seems to work for a
while. When I delete the test queue and rerun the test, the broker does not
seem to process incoming messages:

[yuhengd@node1739 kafka_2.10-0.8.2.1]$ bin/kafka-run-class.sh
org.apache.kafka.clients.tools.ProducerPerformance speedx1 5000 100 -1
acks=1 bootstrap.servers=130.127.133.72:9092 buffer.memory=67108864
batch.size=8196

1 records sent, 0.0 records/sec (0.00 MB/sec), 6.0 ms avg latency,
6.0 max latency.

org.apache.kafka.common.errors.TimeoutException: Failed to update metadata
after 79 ms.


Can anyone suggest what the problem is? Or what configuration should I
change? Thanks!


deleting data automatically

2015-07-24 Thread Yuheng Du
Hi,

I am testing Kafka producer performance, so I created a queue and am
writing a large amount of data to it.

Is there a way to delete the data automatically after some time, say
whenever the data size reaches 50GB or the retention time exceeds 10
seconds, so that old data is deleted, my disk won't fill up, and new
data can still be written?

Thanks!


Changing error codes from the simple consumer?

2015-07-24 Thread Coolbeth, Matthew
I am having a strange interaction with the simple consumer API in Kafka 0.8.2.

Running the following code:


public FetchedMessageList send(String topic, int partition, long offset,
                               int fetchSize) throws KafkaException {
    try {
        FetchedMessageList results = new FetchedMessageList();
        FetchRequest req = new FetchRequestBuilder()
                .clientId(clientId)
                .addFetch(topic, partition, offset, fetchSize)
                .minBytes(1)
                .maxWait(250)
                .build();
        FetchResponse resp = consumer.fetch(req);
        if (resp.hasError()) {
            int code = resp.errorCode(topic, partition);
            if (code == 9) {
                logger.warn("Fetch response contained error code: {}",
                        resp.errorCode(topic, partition));
            } else {
                throw new RuntimeException(String.format(
                        "Fetch response contained error code %d", code));
            }
        }

Following a leader election, the FetchResponse has an error, and trips the 
(code == 9) condition, causing the program to emit a message via logger.warn.
However, the emitted log message reads “Fetch response contained error code: 6”.
I looked over the Kafka source code and don’t see how resp.errorCode might 
return a different value the second time.
Has anyone seen this before?
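
An aside that may help the debugging: in 0.8.2 those numeric codes have
named constants in kafka.common.ErrorMapping (9 is ReplicaNotAvailable,
6 is NotLeaderForPartition), and reusing the captured code in the log
call would also rule out the two errorCode calls disagreeing. For
example:

short code = resp.errorCode(topic, partition);
if (code == ErrorMapping.ReplicaNotAvailableCode()) {          // 9
    logger.warn("Fetch response contained error code: {}", code);
} else if (code == ErrorMapping.NotLeaderForPartitionCode()) { // 6
    // A new leader was likely elected; refresh metadata and retry.
}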

Thanks,

Matt Coolbeth


Re: Log Deletion Behavior

2015-07-24 Thread Mayuresh Gharat
No. This should not happen. At LinkedIn we just use the log retention
hours. Try using that. Change it and bounce the broker. It should work.
Also, looking back at the configs, I am not sure why we have 3 different
configs for the same property:

log.retention.ms
log.retention.minutes
log.retention.hours

We should probably have just the milliseconds.

Thanks,

Mayuresh

On Fri, Jul 24, 2015 at 4:12 PM, JIEFU GONG jg...@berkeley.edu wrote:

 Hi all,

 I have a few broad questions on how log deletion works, specifically in
 conjunction with the log.retention.time setting. Say I published some
 messages to some topics when the configuration was originally set to
 something like log.retention.hours=168 (default). If I publish these
 messages successfully, then later set the configuration to something like
 log.retention.minutes=1, are those logs supposed to persist for the newest
 settings or the old settings? Right now my logs are refusing to delete
 themselves unless I specifically mark them for deletion -- is this the
 correct/anticipated/wanted behavior?

 Thanks for the help!

 --

 Jiefu Gong
 University of California, Berkeley | Class of 2017
 B.A Computer Science | College of Letters and Sciences

 jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427




-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


error while high level consumer

2015-07-24 Thread Kris K
Hi,

I started seeing these errors continuously in the logs when trying to
bring the high-level consumer up. Please help.


ZookeeperConsumerConnector [INFO] [XXX], waiting for the partition
ownership to be deleted: 1

ZookeeperConsumerConnector [INFO] [XXX], end rebalancing
consumer XXX try #0

ZookeeperConsumerConnector [INFO] [XXX], Rebalancing attempt failed.
Clearing the cache before the next rebalancing operation is triggered


Thanks,

Kris


Re: Log Deletion Behavior

2015-07-24 Thread Mayuresh Gharat
To add on, the main thing here is you should be using only one of these
properties.

Thanks,

Mayuresh

On Fri, Jul 24, 2015 at 6:47 PM, Mayuresh Gharat gharatmayures...@gmail.com
 wrote:

 Yes. It should. Do not set other retention settings. Just use the hours
 settings.
 Let me know about this :)

 Thanks,

 Mayuresh

 On Fri, Jul 24, 2015 at 6:43 PM, JIEFU GONG jg...@berkeley.edu wrote:

  Mayuresh, thanks for your comment. I won't be able to change these
  settings until next Monday, but just to confirm: you are saying that if
  I restart the brokers, my logs should delete themselves with respect to
  the newest settings, correct?

 On Fri, Jul 24, 2015 at 6:29 PM, Mayuresh Gharat 
 gharatmayures...@gmail.com
  wrote:

  No. This should not happen. At LinkedIn we just use the log retention
  hours. Try using that. Change it and bounce the broker. It should work.
  Also, looking back at the configs, I am not sure why we have 3 different
  configs for the same property:
 
  log.retention.ms
  log.retention.minutes
  log.retention.hours
 
  We should probably have just the milliseconds.
 
  Thanks,
 
  Mayuresh
 
  On Fri, Jul 24, 2015 at 4:12 PM, JIEFU GONG jg...@berkeley.edu wrote:
 
   Hi all,
  
   I have a few broad questions on how log deletion works, specifically
 in
   conjunction with the log.retention.time setting. Say I published some
   messages to some topics when the configuration was originally set to
   something like log.retention.hours=168 (default). If I publish these
   messages successfully, then later set the configuration to something
 like
   log.retention.minutes=1, are those logs supposed to persist for the
  newest
   settings or the old settings? Right now my logs are refusing to delete
   themselves unless I specifically mark them for deletion -- is this the
   correct/anticipated/wanted behavior?
  
   Thanks for the help!
  
   --
  
   Jiefu Gong
   University of California, Berkeley | Class of 2017
   B.A Computer Science | College of Letters and Sciences
  
   jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
  
 
 
 
  --
  -Regards,
  Mayuresh R. Gharat
  (862) 250-7125
 



 --

 Jiefu Gong
 University of California, Berkeley | Class of 2017
 B.A Computer Science | College of Letters and Sciences

 jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427




 --
 -Regards,
 Mayuresh R. Gharat
 (862) 250-7125




-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: Log Deletion Behavior

2015-07-24 Thread Grant Henke
Also this stackoverflow answer may help: http://stackoverflow.com/a/29672325

On Fri, Jul 24, 2015 at 9:36 PM, Grant Henke ghe...@cloudera.com wrote:

  I would actually suggest using only the ms versions of the retention
  configs. Be sure to check/set all the configs below and look at the
  documented explanations here:
  http://kafka.apache.org/documentation.html#brokerconfigs

 I am guessing the default log.retention.check.interval.ms of 5 minutes is
 too long for your case or you may have changed from the default
 log.cleanup.policy and it is no longer set to delete.

 log.retention.ms
 log.retention.check.interval.ms
 log.cleanup.policy

 On Fri, Jul 24, 2015 at 9:14 PM, Mayuresh Gharat 
 gharatmayures...@gmail.com wrote:

 To add on, the main thing here is you should be using only one of these
 properties.

 Thanks,

 Mayuresh

 On Fri, Jul 24, 2015 at 6:47 PM, Mayuresh Gharat 
 gharatmayures...@gmail.com
  wrote:

  Yes. It should. Do not set other retention settings. Just use the
 hours
  settings.
  Let me know about this :)
 
  Thanks,
 
  Mayuresh
 
  On Fri, Jul 24, 2015 at 6:43 PM, JIEFU GONG jg...@berkeley.edu wrote:
 
  Mayuresh, thanks for your comment. I won't be able to change these
  settings until next Monday, but just to confirm: you are saying that if
  I restart the brokers, my logs should delete themselves with respect to
  the newest settings, correct?
 
  On Fri, Jul 24, 2015 at 6:29 PM, Mayuresh Gharat 
  gharatmayures...@gmail.com
   wrote:
 
   No. This should not happen. At LinkedIn we just use the log retention
   hours. Try using that. Change it and bounce the broker. It should
   work.
   Also, looking back at the configs, I am not sure why we have 3
   different configs for the same property:
  
   log.retention.ms
   log.retention.minutes
   log.retention.hours
  
   We should probably have just the milliseconds.
  
   Thanks,
  
   Mayuresh
  
   On Fri, Jul 24, 2015 at 4:12 PM, JIEFU GONG jg...@berkeley.edu
 wrote:
  
Hi all,
   
I have a few broad questions on how log deletion works,
 specifically
  in
conjunction with the log.retention.time setting. Say I published
 some
messages to some topics when the configuration was originally set
 to
something like log.retention.hours=168 (default). If I publish
 these
messages successfully, then later set the configuration to
 something
  like
log.retention.minutes=1, are those logs supposed to persist for the
   newest
settings or the old settings? Right now my logs are refusing to
 delete
themselves unless I specifically mark them for deletion -- is this
 the
correct/anticipated/wanted behavior?
   
Thanks for the help!
   
--
   
Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences
   
jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
   
  
  
  
   --
   -Regards,
   Mayuresh R. Gharat
   (862) 250-7125
  
 
 
 
  --
 
  Jiefu Gong
  University of California, Berkeley | Class of 2017
  B.A Computer Science | College of Letters and Sciences
 
  jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
 
 
 
 
  --
  -Regards,
  Mayuresh R. Gharat
  (862) 250-7125
 



 --
 -Regards,
 Mayuresh R. Gharat
 (862) 250-7125




 --
 Grant Henke
 Solutions Consultant | Cloudera
 ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke




-- 
Grant Henke
Solutions Consultant | Cloudera
ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: deleting data automatically

2015-07-24 Thread gharatmayuresh15
You can configure that in the configs by setting the log retention:

http://kafka.apache.org/07/configuration.html

Thanks,

Mayuresh

Sent from my iPhone

 On Jul 24, 2015, at 12:49 PM, Yuheng Du yuheng.du.h...@gmail.com wrote:
 
 Hi,
 
 I am testing Kafka producer performance, so I created a queue and am
 writing a large amount of data to it.

 Is there a way to delete the data automatically after some time, say
 whenever the data size reaches 50GB or the retention time exceeds 10
 seconds, so that old data is deleted, my disk won't fill up, and new
 data can still be written?

 Thanks!


Any tool to easily fetch a single message or a few messages starting from a given offset then exit after fetching said count of messages?

2015-07-24 Thread David Luu
Hi,

I notice the kafka-console-consumer.sh script has an option to fetch a max #
of messages, which can be 1 or more, and then exit, which is nice. But as a
high-level consumer it's missing an option to fetch from a given offset other
than the earliest and latest offsets.

Is there any off-the-shelf tool (CLI, simple script, etc.) that I can use
to fetch 1 or more messages from a given offset of some topic as a
consumer, then exit? Or do I have to write my own tool to do this with one
of the kafka libraries in various languages? If the latter, which one lets
me whip it up in the fewest lines possible? I don't want to have to read up
and spend time crafting a tool for this if I don't have to.

I wish this feature were already built into the Kafka toolchain.
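
One off-the-shelf option that may cover this: the SimpleConsumerShell
class that ships with 0.8.x can start at an explicit offset and stop
after a fixed number of messages (flags from memory, so double-check
them against your version):

bin/kafka-run-class.sh kafka.tools.SimpleConsumerShell \
  --broker-list localhost:9092 --topic my-topic --partition 0 \
  --offset 12345 --max-messages 1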

Regards,
David


Re: Log Deletion Behavior

2015-07-24 Thread JIEFU GONG
Okay, I will look into using only the hours setting, but I think that means
the minimum time a log can be stored is 1 hour, right? I think the last time
I tried, Kafka failed to parse decimal values.
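
For sub-hour granularity there are also log.retention.minutes and
log.retention.ms at the broker level, or a per-topic override; an
illustrative example with the 0.8.2 topic tool (topic name and ZooKeeper
address are placeholders):

bin/kafka-topics.sh --zookeeper <zookeeper-host>:2181 --alter \
  --topic my-topic --config retention.ms=60000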

On Fri, Jul 24, 2015 at 6:47 PM, Mayuresh Gharat gharatmayures...@gmail.com
 wrote:

 Yes. It should. Do not set other retention settings. Just use the hours
 settings.
 Let me know about this :)

 Thanks,

 Mayuresh

 On Fri, Jul 24, 2015 at 6:43 PM, JIEFU GONG jg...@berkeley.edu wrote:

  Mayuresh, thanks for your comment. I won't be able to change these
  settings until next Monday, but just to confirm: you are saying that if
  I restart the brokers, my logs should delete themselves with respect to
  the newest settings, correct?
 
  On Fri, Jul 24, 2015 at 6:29 PM, Mayuresh Gharat 
  gharatmayures...@gmail.com
   wrote:
 
   No. This should not happen. At LinkedIn we just use the log retention
   hours. Try using that. Change it and bounce the broker. It should
   work.
   Also, looking back at the configs, I am not sure why we have 3 different
   configs for the same property:
  
   log.retention.ms
   log.retention.minutes
   log.retention.hours
  
   We should probably have just the milliseconds.
  
   Thanks,
  
   Mayuresh
  
   On Fri, Jul 24, 2015 at 4:12 PM, JIEFU GONG jg...@berkeley.edu
 wrote:
  
Hi all,
   
I have a few broad questions on how log deletion works, specifically
 in
conjunction with the log.retention.time setting. Say I published some
messages to some topics when the configuration was originally set to
something like log.retention.hours=168 (default). If I publish these
messages successfully, then later set the configuration to something
  like
log.retention.minutes=1, are those logs supposed to persist for the
   newest
settings or the old settings? Right now my logs are refusing to
 delete
themselves unless I specifically mark them for deletion -- is this
 the
correct/anticipated/wanted behavior?
   
Thanks for the help!
   
--
   
Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences
   
jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
   
  
  
  
   --
   -Regards,
   Mayuresh R. Gharat
   (862) 250-7125
  
 
 
 
  --
 
  Jiefu Gong
  University of California, Berkeley | Class of 2017
  B.A Computer Science | College of Letters and Sciences
 
  jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
 



 --
 -Regards,
 Mayuresh R. Gharat
 (862) 250-7125




-- 

Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences

jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427


Re: deleting data automatically

2015-07-24 Thread Ewen Cheslack-Postava
You'll want to set the log retention policy via
log.retention.{ms,minutes,hours} or log.retention.bytes. If you want really
aggressive collection (e.g., on the order of seconds, as you specified),
you might also need to adjust log.segment.bytes/log.roll.{ms,hours} and
log.retention.check.interval.ms.
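
As a concrete example for the sizes in the question below (values are
illustrative only; note that retention.bytes applies per partition):

log.retention.bytes=53687091200          # ~50 GB
log.retention.ms=10000                   # 10 seconds
log.segment.bytes=134217728              # smaller segments roll sooner
log.retention.check.interval.ms=5000     # check for deletable segments often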

On Fri, Jul 24, 2015 at 12:49 PM, Yuheng Du yuheng.du.h...@gmail.com
wrote:

 Hi,

 I am testing Kafka producer performance, so I created a queue and am
 writing a large amount of data to it.

 Is there a way to delete the data automatically after some time, say
 whenever the data size reaches 50GB or the retention time exceeds 10
 seconds, so that old data is deleted, my disk won't fill up, and new
 data can still be written?

 Thanks!




-- 
Thanks,
Ewen


Re: Log Deletion Behavior

2015-07-24 Thread JIEFU GONG
Mayuresh, thanks for your comment. I won't be able to change these settings
until next Monday, but just to confirm: you are saying that if I restart the
brokers, my logs should delete themselves with respect to the newest
settings, correct?

On Fri, Jul 24, 2015 at 6:29 PM, Mayuresh Gharat gharatmayures...@gmail.com
 wrote:

 No. This should not happen. At LinkedIn we just use the log retention
 hours. Try using that. Change it and bounce the broker. It should work.
 Also, looking back at the configs, I am not sure why we have 3 different
 configs for the same property:

 log.retention.ms
 log.retention.minutes
 log.retention.hours

 We should probably have just the milliseconds.

 Thanks,

 Mayuresh

 On Fri, Jul 24, 2015 at 4:12 PM, JIEFU GONG jg...@berkeley.edu wrote:

  Hi all,
 
  I have a few broad questions on how log deletion works, specifically in
  conjunction with the log.retention.time setting. Say I published some
  messages to some topics when the configuration was originally set to
  something like log.retention.hours=168 (default). If I publish these
  messages successfully, then later set the configuration to something like
  log.retention.minutes=1, are those logs supposed to persist for the
 newest
  settings or the old settings? Right now my logs are refusing to delete
  themselves unless I specifically mark them for deletion -- is this the
  correct/anticipated/wanted behavior?
 
  Thanks for the help!
 
  --
 
  Jiefu Gong
  University of California, Berkeley | Class of 2017
  B.A Computer Science | College of Letters and Sciences
 
  jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
 



 --
 -Regards,
 Mayuresh R. Gharat
 (862) 250-7125




-- 

Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences

jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427


Re: Log Deletion Behavior

2015-07-24 Thread Mayuresh Gharat
Yes. It should. Do not set other retention settings. Just use the hours
settings.
Let me know about this :)

Thanks,

Mayuresh

On Fri, Jul 24, 2015 at 6:43 PM, JIEFU GONG jg...@berkeley.edu wrote:

 Mayuresh, thanks for your comment. I won't be able to change these settings
 until next Monday, but just to confirm: you are saying that if I restart the
 brokers, my logs should delete themselves with respect to the newest
 settings, correct?

 On Fri, Jul 24, 2015 at 6:29 PM, Mayuresh Gharat 
 gharatmayures...@gmail.com
  wrote:

  No. This should not happen. At LinkedIn we just use the log retention
  hours. Try using that. Change it and bounce the broker. It should work.
  Also, looking back at the configs, I am not sure why we have 3 different
  configs for the same property:
 
  log.retention.ms
  log.retention.minutes
  log.retention.hours
 
  We should probably have just the milliseconds.
 
  Thanks,
 
  Mayuresh
 
  On Fri, Jul 24, 2015 at 4:12 PM, JIEFU GONG jg...@berkeley.edu wrote:
 
   Hi all,
  
   I have a few broad questions on how log deletion works, specifically in
   conjunction with the log.retention.time setting. Say I published some
   messages to some topics when the configuration was originally set to
   something like log.retention.hours=168 (default). If I publish these
   messages successfully, then later set the configuration to something
 like
   log.retention.minutes=1, are those logs supposed to persist for the
  newest
   settings or the old settings? Right now my logs are refusing to delete
   themselves unless I specifically mark them for deletion -- is this the
   correct/anticipated/wanted behavior?
  
   Thanks for the help!
  
   --
  
   Jiefu Gong
   University of California, Berkeley | Class of 2017
   B.A Computer Science | College of Letters and Sciences
  
   jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
  
 
 
 
  --
  -Regards,
  Mayuresh R. Gharat
  (862) 250-7125
 



 --

 Jiefu Gong
 University of California, Berkeley | Class of 2017
 B.A Computer Science | College of Letters and Sciences

 jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427




-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: Log Deletion Behavior

2015-07-24 Thread Grant Henke
I would actually suggest using only the ms versions of the retention
configs. Be sure to check/set all the configs below and look at the
documented explanations here:
http://kafka.apache.org/documentation.html#brokerconfigs

I am guessing the default log.retention.check.interval.ms of 5 minutes is
too long for your case or you may have changed from the default
log.cleanup.policy and it is no longer set to delete.

log.retention.ms
log.retention.check.interval.ms
log.cleanup.policy
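
For example (values illustrative, not a recommendation):

log.retention.ms=60000                  # keep data for one minute
log.retention.check.interval.ms=30000   # look for expired segments every 30s
log.cleanup.policy=delete               # must be delete, not compact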

On Fri, Jul 24, 2015 at 9:14 PM, Mayuresh Gharat gharatmayures...@gmail.com
 wrote:

 To add on, the main thing here is you should be using only one of these
 properties.

 Thanks,

 Mayuresh

 On Fri, Jul 24, 2015 at 6:47 PM, Mayuresh Gharat 
 gharatmayures...@gmail.com
  wrote:

  Yes. It should. Do not set other retention settings. Just use the hours
  settings.
  Let me know about this :)
 
  Thanks,
 
  Mayuresh
 
  On Fri, Jul 24, 2015 at 6:43 PM, JIEFU GONG jg...@berkeley.edu wrote:
 
  Mayuresh, thanks for your comment. I won't be able to change these
  settings until next Monday, but just to confirm: you are saying that if
  I restart the brokers, my logs should delete themselves with respect to
  the newest settings, correct?
 
  On Fri, Jul 24, 2015 at 6:29 PM, Mayuresh Gharat 
  gharatmayures...@gmail.com
   wrote:
 
   No. This should not happen. At LinkedIn we just use the log retention
   hours. Try using that. Change it and bounce the broker. It should
   work.
   Also, looking back at the configs, I am not sure why we have 3 different
   configs for the same property:
  
   log.retention.ms
   log.retention.minutes
   log.retention.hours
  
   We should probably have just the milliseconds.
  
   Thanks,
  
   Mayuresh
  
   On Fri, Jul 24, 2015 at 4:12 PM, JIEFU GONG jg...@berkeley.edu
 wrote:
  
Hi all,
   
I have a few broad questions on how log deletion works, specifically
  in
conjunction with the log.retention.time setting. Say I published
 some
messages to some topics when the configuration was originally set to
something like log.retention.hours=168 (default). If I publish these
messages successfully, then later set the configuration to something
  like
log.retention.minutes=1, are those logs supposed to persist for the
   newest
settings or the old settings? Right now my logs are refusing to
 delete
themselves unless I specifically mark them for deletion -- is this
 the
correct/anticipated/wanted behavior?
   
Thanks for the help!
   
--
   
Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences
   
jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
   
  
  
  
   --
   -Regards,
   Mayuresh R. Gharat
   (862) 250-7125
  
 
 
 
  --
 
  Jiefu Gong
  University of California, Berkeley | Class of 2017
  B.A Computer Science | College of Letters and Sciences
 
  jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
 
 
 
 
  --
  -Regards,
  Mayuresh R. Gharat
  (862) 250-7125
 



 --
 -Regards,
 Mayuresh R. Gharat
 (862) 250-7125




-- 
Grant Henke
Solutions Consultant | Cloudera
ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke