Re: Review Request 30403: Patch for KAFKA-1906

2015-05-17 Thread Jaikiran Pai


 On Feb. 7, 2015, 4:41 p.m., Neha Narkhede wrote:
  config/server.properties, line 58
  https://reviews.apache.org/r/30403/diff/1/?file=839692#file839692line58
 
  Why leave the value here to point at /tmp/kafka-logs? The way I see is 
  that we default the log directory to data/ in the kafka installation 
  directory and possibly encourage storing kafka data under var/ if it must 
  be overridden for production, by changing this commented out value to 
  /var/kafka-logs?

This has now been addressed to leave out the reference to the /tmp folder.


- Jaikiran


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review71557
---


On May 18, 2015, 4:42 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated May 18, 2015, 4:42 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 8c3fa286145344fb527307012c1d1000d855aa18 
   bin/windows/kafka-run-class.bat 4aa2ab8ddb9bf56c26fc1c292f359e50b40461e5 
   config/server.properties 80ee2fc6e94a114e7710ae4df3f4e2b83e06f080 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 9efa15ca5567b295ab412ee9eea7c03eb4cdc18b 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 2428dbd7197a58cf4cad42ef82b385dab3a2b15e 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to infer and setup KAFKA_HOME environment variable. This value is then used 
 by the KafkaConfig to decide what path to default to for the Kafka data logs, 
 in the absence of any explicitly set log.dirs (or log.dir) properties.
 
 Care has been taken to ensure that other mechanism which might presently be 
 bypassing the Kafka scripts, will still continue to function, since in the 
 absence of KAFKA_HOME environment property value, we fall back to 
 /tmp/kafka-logs (the present default) as the default data log directory
 
 Existing tests have been run to ensure that this change maintains backward 
 compatibility (i.e. doesn't fail when KAFKA_HOME isn't available/set) and 2 
 new test methods have been added to the KafkaConfigTest to ensure that this 
 change works.
 
 Although the change has been made to both .sh and .bat files, to support 
 this, I haven't actually tested this change on a Windows OS and would 
 appreciate if someone can test this there and let me know if they run into 
 any issues.
 
 
 Thanks,
 
 Jaikiran Pai
 




Re: Review Request 30403: Patch for KAFKA-1906

2015-05-17 Thread Jaikiran Pai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/
---

(Updated May 18, 2015, 4:42 a.m.)


Review request for kafka.


Bugs: KAFKA-1906
https://issues.apache.org/jira/browse/KAFKA-1906


Repository: kafka


Description (updated)
---

KAFKA-1906 Default the Kafka data log directory to $KAFKA_HOME/data/kafka-logs 
directory, where KAFKA_HOME is the Kafka installation directory


Diffs (updated)
-

  bin/kafka-run-class.sh 8c3fa286145344fb527307012c1d1000d855aa18 
  bin/windows/kafka-run-class.bat 4aa2ab8ddb9bf56c26fc1c292f359e50b40461e5 
  config/server.properties 80ee2fc6e94a114e7710ae4df3f4e2b83e06f080 
  core/src/main/scala/kafka/server/KafkaConfig.scala 
9efa15ca5567b295ab412ee9eea7c03eb4cdc18b 
  core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
2428dbd7197a58cf4cad42ef82b385dab3a2b15e 

Diff: https://reviews.apache.org/r/30403/diff/


Testing
---

The change here involves updating the Kafka scripts (for Windows and * nix) to 
infer and setup KAFKA_HOME environment variable. This value is then used by the 
KafkaConfig to decide what path to default to for the Kafka data logs, in the 
absence of any explicitly set log.dirs (or log.dir) properties.

Care has been taken to ensure that other mechanism which might presently be 
bypassing the Kafka scripts, will still continue to function, since in the 
absence of KAFKA_HOME environment property value, we fall back to 
/tmp/kafka-logs (the present default) as the default data log directory

Existing tests have been run to ensure that this change maintains backward 
compatibility (i.e. doesn't fail when KAFKA_HOME isn't available/set) and 2 new 
test methods have been added to the KafkaConfigTest to ensure that this change 
works.

Although the change has been made to both .sh and .bat files, to support this, 
I haven't actually tested this change on a Windows OS and would appreciate if 
someone can test this there and let me know if they run into any issues.


Thanks,

Jaikiran Pai



Re: Review Request 30403: Patch for KAFKA-1906

2015-05-17 Thread Jaikiran Pai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review84107
---


It took me a while to get back to contributing. I have now updated this patch 
to incorporate Neha's review comments. This is now ready to be reviewed again.

- Jaikiran Pai


On May 18, 2015, 4:42 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated May 18, 2015, 4:42 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 8c3fa286145344fb527307012c1d1000d855aa18 
   bin/windows/kafka-run-class.bat 4aa2ab8ddb9bf56c26fc1c292f359e50b40461e5 
   config/server.properties 80ee2fc6e94a114e7710ae4df3f4e2b83e06f080 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 9efa15ca5567b295ab412ee9eea7c03eb4cdc18b 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 2428dbd7197a58cf4cad42ef82b385dab3a2b15e 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to infer and setup KAFKA_HOME environment variable. This value is then used 
 by the KafkaConfig to decide what path to default to for the Kafka data logs, 
 in the absence of any explicitly set log.dirs (or log.dir) properties.
 
 Care has been taken to ensure that other mechanism which might presently be 
 bypassing the Kafka scripts, will still continue to function, since in the 
 absence of KAFKA_HOME environment property value, we fall back to 
 /tmp/kafka-logs (the present default) as the default data log directory
 
 Existing tests have been run to ensure that this change maintains backward 
 compatibility (i.e. doesn't fail when KAFKA_HOME isn't available/set) and 2 
 new test methods have been added to the KafkaConfigTest to ensure that this 
 change works.
 
 Although the change has been made to both .sh and .bat files, to support 
 this, I haven't actually tested this change on a Windows OS and would 
 appreciate if someone can test this there and let me know if they run into 
 any issues.
 
 
 Thanks,
 
 Jaikiran Pai
 




Re: Review Request 30403: Patch for KAFKA-1906

2015-02-08 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review71557
---



config/server.properties
https://reviews.apache.org/r/30403/#comment117312

Why leave the value here to point at /tmp/kafka-logs? The way I see is that 
we default the log directory to data/ in the kafka installation directory and 
possibly encourage storing kafka data under var/ if it must be overridden for 
production, by changing this commented out value to /var/kafka-logs?



core/src/main/scala/kafka/server/KafkaConfig.scala
https://reviews.apache.org/r/30403/#comment117313

If KAFKA_HOME is not set (since somehow they are starting the kafka server 
using different scripts), then we should just point the log directory to the 
value of log.dirs and not force a default to /tmp.


- Neha Narkhede


On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated Jan. 29, 2015, 6:24 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e 
   bin/windows/kafka-run-class.bat 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2 
   config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 6d74983472249eac808d361344c58cc2858ec971 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 82dce80d553957d8b5776a9e140c346d4e07f766 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to infer and setup KAFKA_HOME environment variable. This value is then used 
 by the KafkaConfig to decide what path to default to for the Kafka data logs, 
 in the absence of any explicitly set log.dirs (or log.dir) properties.
 
 Care has been taken to ensure that other mechanism which might presently be 
 bypassing the Kafka scripts, will still continue to function, since in the 
 absence of KAFKA_HOME environment property value, we fall back to 
 /tmp/kafka-logs (the present default) as the default data log directory
 
 Existing tests have been run to ensure that this change maintains backward 
 compatibility (i.e. doesn't fail when KAFKA_HOME isn't available/set) and 2 
 new test methods have been added to the KafkaConfigTest to ensure that this 
 change works.
 
 Although the change has been made to both .sh and .bat files, to support 
 this, I haven't actually tested this change on a Windows OS and would 
 appreciate if someone can test this there and let me know if they run into 
 any issues.
 
 
 Thanks,
 
 Jaikiran Pai
 




Re: Review Request 30403: Patch for KAFKA-1906

2015-02-02 Thread Neha Narkhede
+1 on including a reasonable default for log.dirs that points to data/ in
the installation directory. This is followed by some other systems and is
probably a little better compared to /tmp.

Also, I actually think we should default to production ready settings for
most configs, at least those related to performance. In most cases, this
works just fine and doesn't hurt the development environment much. Same
goes for GC settings. There is no value in shipping with a smaller heap or
non standard GC configs.

I ran into a bunch of this while writing docs for Kafka and thinking from
the user's perspective. And I think doing both of these things will improve
user experience.

Thanks,
Neha

On Sun, Feb 1, 2015 at 8:43 AM, Jay Kreps boredandr...@gmail.com wrote:

 Yeah I second Jaikiran's take: I think we should stick to something that
 works out of the box without editing config. I think the experience here
 is important, and adding one manual configuration step quickly cascades
 into a bunch of these steps (since it is always easier to make something
 manual than to think through how to implement a reasonable default).

 Of course when going to production you have to think, and I don't have a
 ton of sympathy for people who are doing production deploys with their data
 in /tmp, but I agree it would be better to make this a little safer even in
 that case if we can come up with a way to give a local data dir.

 Another approach I just thought of would be to just ship two configs: a
 developer.properties and production.properties each of which would have
 reasonable starting point configurations. This might actually be better
 since many of the settings we default to are a little anemic for a
 production install (e.g. default number of partitions, default replication
 factor, thread count, etc). I suspect this might solve a lot of support
 problems, actually, since most people don't think about that stuff.

 -Jay

 On Sat, Jan 31, 2015 at 7:27 AM, Jaikiran Pai jai.forums2...@gmail.com
 wrote:

   Hi Jeff,
 
  I guess you are :)
 
  Personally, whenever I download and try a new project *in development
  environment* I always just want to get it up and running without having
 to
  fiddle with configurations. Of course, I do a bit of reading the docs,
  before trying it out, but I like to have the install and run to be
  straightforward without having to change/add configurations. Having
  sensible defaults helps in development environments and in getting
 started.
  IMO, this param belongs to that category.
 
  -Jaikiran
 
 
  On Thursday 29 January 2015 08:00 PM, Jeff Holoman wrote:
 
  Maybe I'm in the minority here, but I actually don't think there should
 be
  a default for this param and you should be required to explicitly set
 this.
 
  On Thu, Jan 29, 2015 at 5:43 AM, Jay Kreps boredandr...@gmail.com
 wrote:
 
 
 
   On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
We added --override option to KafkaServer that allows overriding
  default configuration from commandline.
I believe that just changing the shell script to include --override
  log.dir=${KAFKA_HOME}/data
may be enough?
   
overriding configuration from server.properties in code can be very
  unintuitive.
  
   Jaikiran Pai wrote:
   That sounds a good idea. I wasn't aware of the --override option.
  I'll give that a try and if it works then the change will be limited to
  just the scripts.
  
   Jaikiran Pai wrote:
   Hi Gwen,
  
   I had a look at the JIRA
  https://issues.apache.org/jira/browse/KAFKA-1654 which added support
 for
  --override and also the purpose of that option. From what I see it
 won't be
  useful in this case, because in this current task, we don't want to
  override a value that has been explicitly set (via server.properties for
  example). Instead we want to handle a case where no explicit value is
  specified for the data log directory and in such cases default it to a
 path
  which resides under the Kafka install directory.
  
   If we use the --override option in our (startup) scripts to set
  log.dir=${KAFKA_HOME}/data, we will end up forcing this value as the
  log.dir even when the user has intentionally specified a different path
 for
  the data logs.
  
   Let me know if I misunderstood your suggestion.
  
   Jay Kreps wrote:
   I think you are right that --override won't work but maybe this is
  still a good suggestion?
  
   Something seems odd about force overriding the working directory
 of
  the process just to set the log directory.
  
   Another option might be to add --default. This would work like
  --override but would provide a default value only if none is specified.
 I
  think this might be possible because the java.util.Properties we use for
  config supports a hierarchy of defaults. E.g. you can say new
  Properties(defaultProperties). Not sure if this is better or worse.
  
   Thoughts?
  
   Jaikiran Pai wrote:
   Hi Jay,
  
Another 

Re: Review Request 30403: Patch for KAFKA-1906

2015-02-02 Thread Gwen Shapira
While I appreciate production-grade configs a lot, I prefer to ship
with usable-out-of-the-box configs.

I suspect that large part of the amazing adoption that projects like
MySQL, MongoDB, Cassandra and Kafka had is how easy it is to install
and get going on a local dev box. Working as DB Architect, I could see
how new projects would always start with MySQL first. Getting MySQL on
local box is trivial, but no developer would willingly install Oracle
on his local box if he can help it.

I suspect that there's no one true production configuration, things
depend on hardware, environment, requirements, etc.

So, voting on the side of dev ready and not production ready

Gwen



On Mon, Feb 2, 2015 at 11:45 AM, Neha Narkhede n...@confluent.io wrote:
 +1 on including a reasonable default for log.dirs that points to data/ in
 the installation directory. This is followed by some other systems and is
 probably a little better compared to /tmp.

 Also, I actually think we should default to production ready settings for
 most configs, at least those related to performance. In most cases, this
 works just fine and doesn't hurt the development environment much. Same goes
 for GC settings. There is no value in shipping with a smaller heap or non
 standard GC configs.

 I ran into a bunch of this while writing docs for Kafka and thinking from
 the user's perspective. And I think doing both of these things will improve
 user experience.

 Thanks,
 Neha

 On Sun, Feb 1, 2015 at 8:43 AM, Jay Kreps boredandr...@gmail.com wrote:

 Yeah I second Jaikiran's take: I think we should stick to something that
 works out of the box without editing config. I think the experience here
 is important, and adding one manual configuration step quickly cascades
 into a bunch of these steps (since it is always easier to make something
 manual than to think through how to implement a reasonable default).

 Of course when going to production you have to think, and I don't have a
 ton of sympathy for people who are doing production deploys with their
 data
 in /tmp, but I agree it would be better to make this a little safer even
 in
 that case if we can come up with a way to give a local data dir.

 Another approach I just thought of would be to just ship two configs: a
 developer.properties and production.properties each of which would have
 reasonable starting point configurations. This might actually be better
 since many of the settings we default to are a little anemic for a
 production install (e.g. default number of partitions, default replication
 factor, thread count, etc). I suspect this might solve a lot of support
 problems, actually, since most people don't think about that stuff.

 -Jay

 On Sat, Jan 31, 2015 at 7:27 AM, Jaikiran Pai jai.forums2...@gmail.com
 wrote:

   Hi Jeff,
 
  I guess you are :)
 
  Personally, whenever I download and try a new project *in development
  environment* I always just want to get it up and running without having
  to
  fiddle with configurations. Of course, I do a bit of reading the docs,
  before trying it out, but I like to have the install and run to be
  straightforward without having to change/add configurations. Having
  sensible defaults helps in development environments and in getting
  started.
  IMO, this param belongs to that category.
 
  -Jaikiran
 
 
  On Thursday 29 January 2015 08:00 PM, Jeff Holoman wrote:
 
  Maybe I'm in the minority here, but I actually don't think there should
  be
  a default for this param and you should be required to explicitly set
  this.
 
  On Thu, Jan 29, 2015 at 5:43 AM, Jay Kreps boredandr...@gmail.com
  wrote:
 
 
 
   On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
We added --override option to KafkaServer that allows overriding
  default configuration from commandline.
I believe that just changing the shell script to include --override
  log.dir=${KAFKA_HOME}/data
may be enough?
   
overriding configuration from server.properties in code can be very
  unintuitive.
  
   Jaikiran Pai wrote:
   That sounds a good idea. I wasn't aware of the --override option.
  I'll give that a try and if it works then the change will be limited to
  just the scripts.
  
   Jaikiran Pai wrote:
   Hi Gwen,
  
   I had a look at the JIRA
  https://issues.apache.org/jira/browse/KAFKA-1654 which added support
  for
  --override and also the purpose of that option. From what I see it
  won't be
  useful in this case, because in this current task, we don't want to
  override a value that has been explicitly set (via server.properties
  for
  example). Instead we want to handle a case where no explicit value is
  specified for the data log directory and in such cases default it to a
  path
  which resides under the Kafka install directory.
  
   If we use the --override option in our (startup) scripts to set
  log.dir=${KAFKA_HOME}/data, we will end up forcing this value as the
  log.dir even when the user has intentionally specified a 

Re: Review Request 30403: Patch for KAFKA-1906

2015-02-01 Thread Jeff Holoman
Yep I totally get that. But...you already have to modify the configs
anyway. The primary goals of default values are basically two-fold

1) Get started quickly
2) Provide reasonable values for stuff people don't really know about.

I guess I would just rather see people take the time (30s to 1min maybe?)
to consider where they want their storage location for Kafka to be. I don't
think that's an unreasonable expectation and will save people like me from
having to help people move stuff when they realize / has filled up.

At any rate, I think what's been suggested here is better than /tmp

Thanks

Jeff

On Sat, Jan 31, 2015 at 10:27 AM, Jaikiran Pai jai.forums2...@gmail.com
wrote:

  Hi Jeff,

 I guess you are :)

 Personally, whenever I download and try a new project *in development
 environment* I always just want to get it up and running without having to
 fiddle with configurations. Of course, I do a bit of reading the docs,
 before trying it out, but I like to have the install and run to be
 straightforward without having to change/add configurations. Having
 sensible defaults helps in development environments and in getting started.
 IMO, this param belongs to that category.

 -Jaikiran


 On Thursday 29 January 2015 08:00 PM, Jeff Holoman wrote:

 Maybe I'm in the minority here, but I actually don't think there should be
 a default for this param and you should be required to explicitly set this.

 On Thu, Jan 29, 2015 at 5:43 AM, Jay Kreps boredandr...@gmail.com wrote:



  On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
   We added --override option to KafkaServer that allows overriding
 default configuration from commandline.
   I believe that just changing the shell script to include --override
 log.dir=${KAFKA_HOME}/data
   may be enough?
  
   overriding configuration from server.properties in code can be very
 unintuitive.
 
  Jaikiran Pai wrote:
  That sounds a good idea. I wasn't aware of the --override option.
 I'll give that a try and if it works then the change will be limited to
 just the scripts.
 
  Jaikiran Pai wrote:
  Hi Gwen,
 
  I had a look at the JIRA
 https://issues.apache.org/jira/browse/KAFKA-1654 which added support for
 --override and also the purpose of that option. From what I see it won't be
 useful in this case, because in this current task, we don't want to
 override a value that has been explicitly set (via server.properties for
 example). Instead we want to handle a case where no explicit value is
 specified for the data log directory and in such cases default it to a path
 which resides under the Kafka install directory.
 
  If we use the --override option in our (startup) scripts to set
 log.dir=${KAFKA_HOME}/data, we will end up forcing this value as the
 log.dir even when the user has intentionally specified a different path for
 the data logs.
 
  Let me know if I misunderstood your suggestion.
 
  Jay Kreps wrote:
  I think you are right that --override won't work but maybe this is
 still a good suggestion?
 
  Something seems odd about force overriding the working directory of
 the process just to set the log directory.
 
  Another option might be to add --default. This would work like
 --override but would provide a default value only if none is specified. I
 think this might be possible because the java.util.Properties we use for
 config supports a hierarchy of defaults. E.g. you can say new
 Properties(defaultProperties). Not sure if this is better or worse.
 
  Thoughts?
 
  Jaikiran Pai wrote:
  Hi Jay,
 
   Another option might be to add --default. This would work like
 --override but would provide a default value only if none is specified. I
 think this might be possible because the java.util.Properties we use for
 config supports a hierarchy of defaults. E.g. you can say new
 Properties(defaultProperties). Not sure if this is better or worse.
 
  I think --default sounds like a good idea which could help us use
 it for other properties too (if we need to). It does look better than the
 current change that I have done, because the Java code then doesn't have to
 worry about how that default value is sourced. We can then just update the
 scripts to set up the default for the log.dir appropriately.
 
  I can work towards adding support for it and will update this patch
 once it's ready.
 
  As for:
 
   Something seems odd about force overriding the working directory
 of the process just to set the log directory.
 
  Sorry, I don't understand what you meant there. Is it something
 about the change that was done to the scripts?

 I guess what I mean is: is there any other reason you might care about
 the working directory of the process? If so we probably don't want to force
 it to be the Kafka directory. If not it may actually be fine and in that
 case I think having relative paths work is nice. I don't personally know
 the answer to this, what is good practice for a server process?


 - Jay

Re: Review Request 30403: Patch for KAFKA-1906

2015-02-01 Thread Jay Kreps
Yeah I second Jaikiran's take: I think we should stick to something that
works out of the box without editing config. I think the experience here
is important, and adding one manual configuration step quickly cascades
into a bunch of these steps (since it is always easier to make something
manual than to think through how to implement a reasonable default).

Of course when going to production you have to think, and I don't have a
ton of sympathy for people who are doing production deploys with their data
in /tmp, but I agree it would be better to make this a little safer even in
that case if we can come up with a way to give a local data dir.

Another approach I just thought of would be to just ship two configs: a
developer.properties and production.properties each of which would have
reasonable starting point configurations. This might actually be better
since many of the settings we default to are a little anemic for a
production install (e.g. default number of partitions, default replication
factor, thread count, etc). I suspect this might solve a lot of support
problems, actually, since most people don't think about that stuff.

-Jay

On Sat, Jan 31, 2015 at 7:27 AM, Jaikiran Pai jai.forums2...@gmail.com
wrote:

  Hi Jeff,

 I guess you are :)

 Personally, whenever I download and try a new project *in development
 environment* I always just want to get it up and running without having to
 fiddle with configurations. Of course, I do a bit of reading the docs,
 before trying it out, but I like to have the install and run to be
 straightforward without having to change/add configurations. Having
 sensible defaults helps in development environments and in getting started.
 IMO, this param belongs to that category.

 -Jaikiran


 On Thursday 29 January 2015 08:00 PM, Jeff Holoman wrote:

 Maybe I'm in the minority here, but I actually don't think there should be
 a default for this param and you should be required to explicitly set this.

 On Thu, Jan 29, 2015 at 5:43 AM, Jay Kreps boredandr...@gmail.com wrote:



  On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
   We added --override option to KafkaServer that allows overriding
 default configuration from commandline.
   I believe that just changing the shell script to include --override
 log.dir=${KAFKA_HOME}/data
   may be enough?
  
   overriding configuration from server.properties in code can be very
 unintuitive.
 
  Jaikiran Pai wrote:
  That sounds a good idea. I wasn't aware of the --override option.
 I'll give that a try and if it works then the change will be limited to
 just the scripts.
 
  Jaikiran Pai wrote:
  Hi Gwen,
 
  I had a look at the JIRA
 https://issues.apache.org/jira/browse/KAFKA-1654 which added support for
 --override and also the purpose of that option. From what I see it won't be
 useful in this case, because in this current task, we don't want to
 override a value that has been explicitly set (via server.properties for
 example). Instead we want to handle a case where no explicit value is
 specified for the data log directory and in such cases default it to a path
 which resides under the Kafka install directory.
 
  If we use the --override option in our (startup) scripts to set
 log.dir=${KAFKA_HOME}/data, we will end up forcing this value as the
 log.dir even when the user has intentionally specified a different path for
 the data logs.
 
  Let me know if I misunderstood your suggestion.
 
  Jay Kreps wrote:
  I think you are right that --override won't work but maybe this is
 still a good suggestion?
 
  Something seems odd about force overriding the working directory of
 the process just to set the log directory.
 
  Another option might be to add --default. This would work like
 --override but would provide a default value only if none is specified. I
 think this might be possible because the java.util.Properties we use for
 config supports a hierarchy of defaults. E.g. you can say new
 Properties(defaultProperties). Not sure if this is better or worse.
 
  Thoughts?
 
  Jaikiran Pai wrote:
  Hi Jay,
 
   Another option might be to add --default. This would work like
 --override but would provide a default value only if none is specified. I
 think this might be possible because the java.util.Properties we use for
 config supports a hierarchy of defaults. E.g. you can say new
 Properties(defaultProperties). Not sure if this is better or worse.
 
  I think --default sounds like a good idea which could help us use
 it for other properties too (if we need to). It does look better than the
 current change that I have done, because the Java code then doesn't have to
 worry about how that default value is sourced. We can then just update the
 scripts to set up the default for the log.dir appropriately.
 
  I can work towards adding support for it and will update this patch
 once it's ready.
 
  As for:
 
   Something seems odd about force overriding the working 

Re: Review Request 30403: Patch for KAFKA-1906

2015-01-31 Thread Jaikiran Pai

Hi Jeff,

I guess you are :)

Personally, whenever I download and try a new project *in development 
environment* I always just want to get it up and running without having 
to fiddle with configurations. Of course, I do a bit of reading the 
docs, before trying it out, but I like to have the install and run to be 
straightforward without having to change/add configurations. Having 
sensible defaults helps in development environments and in getting 
started. IMO, this param belongs to that category.


-Jaikiran

On Thursday 29 January 2015 08:00 PM, Jeff Holoman wrote:
Maybe I'm in the minority here, but I actually don't think there 
should be a default for this param and you should be required to 
explicitly set this.


On Thu, Jan 29, 2015 at 5:43 AM, Jay Kreps boredandr...@gmail.com 
mailto:boredandr...@gmail.com wrote:




 On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
  We added --override option to KafkaServer that allows overriding 
default configuration from
commandline.
  I believe that just changing the shell script to include
--override log.dir=${KAFKA_HOME}/data
  may be enough?
 
  overriding configuration from server.properties in code can be
very unintuitive.

 Jaikiran Pai wrote:
 That sounds a good idea. I wasn't aware of the --override
option. I'll give that a try and if it works then the change will
be limited to just the scripts.

 Jaikiran Pai wrote:
 Hi Gwen,

 I had a look at the JIRA
https://issues.apache.org/jira/browse/KAFKA-1654 which added
support for --override and also the purpose of that option. From
what I see it won't be useful in this case, because in this
current task, we don't want to override a value that has been
explicitly set (via server.properties for example). Instead we
want to handle a case where no explicit value is specified for the
data log directory and in such cases default it to a path which
resides under the Kafka install directory.

 If we use the --override option in our (startup) scripts to
set log.dir=${KAFKA_HOME}/data, we will end up forcing this value
as the log.dir even when the user has intentionally specified a
different path for the data logs.

 Let me know if I misunderstood your suggestion.

 Jay Kreps wrote:
 I think you are right that --override won't work but maybe
this is still a good suggestion?

 Something seems odd about force overriding the working
directory of the process just to set the log directory.

 Another option might be to add --default. This would work
like --override but would provide a default value only if none is
specified. I think this might be possible because the
java.util.Properties we use for config supports a hierarchy of
defaults. E.g. you can say new Properties(defaultProperties). Not
sure if this is better or worse.

 Thoughts?

 Jaikiran Pai wrote:
 Hi Jay,

  Another option might be to add --default. This would work
like --override but would provide a default value only if none is
specified. I think this might be possible because the
java.util.Properties we use for config supports a hierarchy of
defaults. E.g. you can say new Properties(defaultProperties). Not
sure if this is better or worse.

 I think --default sounds like a good idea which could help
us use it for other properties too (if we need to). It does look
better than the current change that I have done, because the Java
code then doesn't have to worry about how that default value is
sourced. We can then just update the scripts to set up the default
for the log.dir appropriately.

 I can work towards adding support for it and will update
this patch once it's ready.

 As for:

  Something seems odd about force overriding the working
directory of the process just to set the log directory.

 Sorry, I don't understand what you meant there. Is it
something about the change that was done to the scripts?

I guess what I mean is: is there any other reason you might care
about the working directory of the process? If so we probably
don't want to force it to be the Kafka directory. If not it may
actually be fine and in that case I think having relative paths
work is nice. I don't personally know the answer to this, what is
good practice for a server process?


- Jay


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review70168
---


On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:

 ---
 This is an automatically 

Re: Review Request 30403: Patch for KAFKA-1906

2015-01-29 Thread Jaikiran Pai


 On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
  We added --override option to KafkaServer that allows overriding default 
  configuration from commandline.
  I believe that just changing the shell script to include --override 
  log.dir=${KAFKA_HOME}/data 
  may be enough?
  
  overriding configuration from server.properties in code can be very 
  unintuitive.
 
 Jaikiran Pai wrote:
 That sounds a good idea. I wasn't aware of the --override option. I'll 
 give that a try and if it works then the change will be limited to just the 
 scripts.
 
 Jaikiran Pai wrote:
 Hi Gwen,
 
 I had a look at the JIRA https://issues.apache.org/jira/browse/KAFKA-1654 
 which added support for --override and also the purpose of that option. From 
 what I see it won't be useful in this case, because in this current task, we 
 don't want to override a value that has been explicitly set (via 
 server.properties for example). Instead we want to handle a case where no 
 explicit value is specified for the data log directory and in such cases 
 default it to a path which resides under the Kafka install directory.
 
 If we use the --override option in our (startup) scripts to set 
 log.dir=${KAFKA_HOME}/data, we will end up forcing this value as the log.dir 
 even when the user has intentionally specified a different path for the data 
 logs.
 
 Let me know if I misunderstood your suggestion.
 
 Jay Kreps wrote:
 I think you are right that --override won't work but maybe this is still 
 a good suggestion?
 
 Something seems odd about force overriding the working directory of the 
 process just to set the log directory.
 
 Another option might be to add --default. This would work like --override 
 but would provide a default value only if none is specified. I think this 
 might be possible because the java.util.Properties we use for config supports 
 a hierarchy of defaults. E.g. you can say new Properties(defaultProperties). 
 Not sure if this is better or worse.
 
 Thoughts?

Hi Jay,

 Another option might be to add --default. This would work like --override but 
 would provide a default value only if none is specified. I think this might 
 be possible because the java.util.Properties we use for config supports a 
 hierarchy of defaults. E.g. you can say new Properties(defaultProperties). 
 Not sure if this is better or worse.

I think --default sounds like a good idea which could help us use it for other 
properties too (if we need to). It does look better than the current change 
that I have done, because the Java code then doesn't have to worry about how 
that default value is sourced. We can then just update the scripts to set up 
the default for the log.dir appropriately.

I can work towards adding support for it and will update this patch once it's 
ready.

As for:

 Something seems odd about force overriding the working directory of the 
 process just to set the log directory.

Sorry, I don't understand what you meant there. Is it something about the 
change that was done to the scripts?


- Jaikiran


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review70168
---


On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated Jan. 29, 2015, 6:24 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e 
   bin/windows/kafka-run-class.bat 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2 
   config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 6d74983472249eac808d361344c58cc2858ec971 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 82dce80d553957d8b5776a9e140c346d4e07f766 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to infer and setup KAFKA_HOME environment variable. This value is then used 
 by the KafkaConfig to decide what path to default to for the Kafka data logs, 
 in the absence of any explicitly set log.dirs (or log.dir) properties.
 
 Care has been taken to ensure that other mechanism which might presently be 
 bypassing the Kafka scripts, will still continue to function, since in the 
 absence of KAFKA_HOME environment property value, we fall back to 
 

Re: Review Request 30403: Patch for KAFKA-1906

2015-01-29 Thread Jay Kreps


 On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
  We added --override option to KafkaServer that allows overriding default 
  configuration from commandline.
  I believe that just changing the shell script to include --override 
  log.dir=${KAFKA_HOME}/data 
  may be enough?
  
  overriding configuration from server.properties in code can be very 
  unintuitive.
 
 Jaikiran Pai wrote:
 That sounds a good idea. I wasn't aware of the --override option. I'll 
 give that a try and if it works then the change will be limited to just the 
 scripts.
 
 Jaikiran Pai wrote:
 Hi Gwen,
 
 I had a look at the JIRA https://issues.apache.org/jira/browse/KAFKA-1654 
 which added support for --override and also the purpose of that option. From 
 what I see it won't be useful in this case, because in this current task, we 
 don't want to override a value that has been explicitly set (via 
 server.properties for example). Instead we want to handle a case where no 
 explicit value is specified for the data log directory and in such cases 
 default it to a path which resides under the Kafka install directory.
 
 If we use the --override option in our (startup) scripts to set 
 log.dir=${KAFKA_HOME}/data, we will end up forcing this value as the log.dir 
 even when the user has intentionally specified a different path for the data 
 logs.
 
 Let me know if I misunderstood your suggestion.
 
 Jay Kreps wrote:
 I think you are right that --override won't work but maybe this is still 
 a good suggestion?
 
 Something seems odd about force overriding the working directory of the 
 process just to set the log directory.
 
 Another option might be to add --default. This would work like --override 
 but would provide a default value only if none is specified. I think this 
 might be possible because the java.util.Properties we use for config supports 
 a hierarchy of defaults. E.g. you can say new Properties(defaultProperties). 
 Not sure if this is better or worse.
 
 Thoughts?
 
 Jaikiran Pai wrote:
 Hi Jay,
 
  Another option might be to add --default. This would work like 
 --override but would provide a default value only if none is specified. I 
 think this might be possible because the java.util.Properties we use for 
 config supports a hierarchy of defaults. E.g. you can say new 
 Properties(defaultProperties). Not sure if this is better or worse.
 
 I think --default sounds like a good idea which could help us use it for 
 other properties too (if we need to). It does look better than the current 
 change that I have done, because the Java code then doesn't have to worry 
 about how that default value is sourced. We can then just update the scripts 
 to set up the default for the log.dir appropriately.
 
 I can work towards adding support for it and will update this patch once 
 it's ready.
 
 As for:
 
  Something seems odd about force overriding the working directory of the 
 process just to set the log directory.
 
 Sorry, I don't understand what you meant there. Is it something about the 
 change that was done to the scripts?

I guess what I mean is: is there any other reason you might care about the 
working directory of the process? If so we probably don't want to force it to 
be the Kafka directory. If not it may actually be fine and in that case I think 
having relative paths work is nice. I don't personally know the answer to this, 
what is good practice for a server process?


- Jay


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review70168
---


On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated Jan. 29, 2015, 6:24 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e 
   bin/windows/kafka-run-class.bat 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2 
   config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 6d74983472249eac808d361344c58cc2858ec971 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 82dce80d553957d8b5776a9e140c346d4e07f766 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to 

Re: Review Request 30403: Patch for KAFKA-1906

2015-01-29 Thread Jay Kreps


 On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
  We added --override option to KafkaServer that allows overriding default 
  configuration from commandline.
  I believe that just changing the shell script to include --override 
  log.dir=${KAFKA_HOME}/data 
  may be enough?
  
  overriding configuration from server.properties in code can be very 
  unintuitive.
 
 Jaikiran Pai wrote:
 That sounds a good idea. I wasn't aware of the --override option. I'll 
 give that a try and if it works then the change will be limited to just the 
 scripts.
 
 Jaikiran Pai wrote:
 Hi Gwen,
 
 I had a look at the JIRA https://issues.apache.org/jira/browse/KAFKA-1654 
 which added support for --override and also the purpose of that option. From 
 what I see it won't be useful in this case, because in this current task, we 
 don't want to override a value that has been explicitly set (via 
 server.properties for example). Instead we want to handle a case where no 
 explicit value is specified for the data log directory and in such cases 
 default it to a path which resides under the Kafka install directory.
 
 If we use the --override option in our (startup) scripts to set 
 log.dir=${KAFKA_HOME}/data, we will end up forcing this value as the log.dir 
 even when the user has intentionally specified a different path for the data 
 logs.
 
 Let me know if I misunderstood your suggestion.

I think you are right that --override won't work but maybe this is still a good 
suggestion?

Something seems odd about force overriding the working directory of the process 
just to set the log directory.

Another option might be to add --default. This would work like --override but 
would provide a default value only if none is specified. I think this might be 
possible because the java.util.Properties we use for config supports a 
hierarchy of defaults. E.g. you can say new Properties(defaultProperties). Not 
sure if this is better or worse.

Thoughts?


- Jay


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review70168
---


On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated Jan. 29, 2015, 6:24 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e 
   bin/windows/kafka-run-class.bat 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2 
   config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 6d74983472249eac808d361344c58cc2858ec971 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 82dce80d553957d8b5776a9e140c346d4e07f766 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to infer and setup KAFKA_HOME environment variable. This value is then used 
 by the KafkaConfig to decide what path to default to for the Kafka data logs, 
 in the absence of any explicitly set log.dirs (or log.dir) properties.
 
 Care has been taken to ensure that other mechanism which might presently be 
 bypassing the Kafka scripts, will still continue to function, since in the 
 absence of KAFKA_HOME environment property value, we fall back to 
 /tmp/kafka-logs (the present default) as the default data log directory
 
 Existing tests have been run to ensure that this change maintains backward 
 compatibility (i.e. doesn't fail when KAFKA_HOME isn't available/set) and 2 
 new test methods have been added to the KafkaConfigTest to ensure that this 
 change works.
 
 Although the change has been made to both .sh and .bat files, to support 
 this, I haven't actually tested this change on a Windows OS and would 
 appreciate if someone can test this there and let me know if they run into 
 any issues.
 
 
 Thanks,
 
 Jaikiran Pai
 




Re: Review Request 30403: Patch for KAFKA-1906

2015-01-28 Thread Gwen Shapira

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review70168
---


We added --override option to KafkaServer that allows overriding default 
configuration from commandline.
I believe that just changing the shell script to include --override 
log.dir=${KAFKA_HOME}/data 
may be enough?

overriding configuration from server.properties in code can be very unintuitive.

- Gwen Shapira


On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated Jan. 29, 2015, 6:24 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e 
   bin/windows/kafka-run-class.bat 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2 
   config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 6d74983472249eac808d361344c58cc2858ec971 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 82dce80d553957d8b5776a9e140c346d4e07f766 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to infer and setup KAFKA_HOME environment variable. This value is then used 
 by the KafkaConfig to decide what path to default to for the Kafka data logs, 
 in the absence of any explicitly set log.dirs (or log.dir) properties.
 
 Care has been taken to ensure that other mechanism which might presently be 
 bypassing the Kafka scripts, will still continue to function, since in the 
 absence of KAFKA_HOME environment property value, we fall back to 
 /tmp/kafka-logs (the present default) as the default data log directory
 
 Existing tests have been run to ensure that this change maintains backward 
 compatibility (i.e. doesn't fail when KAFKA_HOME isn't available/set) and 2 
 new test methods have been added to the KafkaConfigTest to ensure that this 
 change works.
 
 Although the change has been made to both .sh and .bat files, to support 
 this, I haven't actually tested this change on a Windows OS and would 
 appreciate if someone can test this there and let me know if they run into 
 any issues.
 
 
 Thanks,
 
 Jaikiran Pai
 




Re: Review Request 30403: Patch for KAFKA-1906

2015-01-28 Thread Jaikiran Pai


 On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
  We added --override option to KafkaServer that allows overriding default 
  configuration from commandline.
  I believe that just changing the shell script to include --override 
  log.dir=${KAFKA_HOME}/data 
  may be enough?
  
  overriding configuration from server.properties in code can be very 
  unintuitive.

That sounds a good idea. I wasn't aware of the --override option. I'll give 
that a try and if it works then the change will be limited to just the scripts.


- Jaikiran


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review70168
---


On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated Jan. 29, 2015, 6:24 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e 
   bin/windows/kafka-run-class.bat 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2 
   config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 6d74983472249eac808d361344c58cc2858ec971 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 82dce80d553957d8b5776a9e140c346d4e07f766 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to infer and setup KAFKA_HOME environment variable. This value is then used 
 by the KafkaConfig to decide what path to default to for the Kafka data logs, 
 in the absence of any explicitly set log.dirs (or log.dir) properties.
 
 Care has been taken to ensure that other mechanism which might presently be 
 bypassing the Kafka scripts, will still continue to function, since in the 
 absence of KAFKA_HOME environment property value, we fall back to 
 /tmp/kafka-logs (the present default) as the default data log directory
 
 Existing tests have been run to ensure that this change maintains backward 
 compatibility (i.e. doesn't fail when KAFKA_HOME isn't available/set) and 2 
 new test methods have been added to the KafkaConfigTest to ensure that this 
 change works.
 
 Although the change has been made to both .sh and .bat files, to support 
 this, I haven't actually tested this change on a Windows OS and would 
 appreciate if someone can test this there and let me know if they run into 
 any issues.
 
 
 Thanks,
 
 Jaikiran Pai
 




Re: Review Request 30403: Patch for KAFKA-1906

2015-01-28 Thread Jaikiran Pai


 On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
  We added --override option to KafkaServer that allows overriding default 
  configuration from commandline.
  I believe that just changing the shell script to include --override 
  log.dir=${KAFKA_HOME}/data 
  may be enough?
  
  overriding configuration from server.properties in code can be very 
  unintuitive.
 
 Jaikiran Pai wrote:
 That sounds a good idea. I wasn't aware of the --override option. I'll 
 give that a try and if it works then the change will be limited to just the 
 scripts.

Hi Gwen,

I had a look at the JIRA https://issues.apache.org/jira/browse/KAFKA-1654 which 
added support for --override and also the purpose of that option. From what I 
see it won't be useful in this case, because in this current task, we don't 
want to override a value that has been explicitly set (via server.properties 
for example). Instead we want to handle a case where no explicit value is 
specified for the data log directory and in such cases default it to a path 
which resides under the Kafka install directory.

If we use the --override option in our (startup) scripts to set 
log.dir=${KAFKA_HOME}/data, we will end up forcing this value as the log.dir 
even when the user has intentionally specified a different path for the data 
logs.

Let me know if I misunderstood your suggestion.


- Jaikiran


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review70168
---


On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated Jan. 29, 2015, 6:24 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e 
   bin/windows/kafka-run-class.bat 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2 
   config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 6d74983472249eac808d361344c58cc2858ec971 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 82dce80d553957d8b5776a9e140c346d4e07f766 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to infer and setup KAFKA_HOME environment variable. This value is then used 
 by the KafkaConfig to decide what path to default to for the Kafka data logs, 
 in the absence of any explicitly set log.dirs (or log.dir) properties.
 
 Care has been taken to ensure that other mechanism which might presently be 
 bypassing the Kafka scripts, will still continue to function, since in the 
 absence of KAFKA_HOME environment property value, we fall back to 
 /tmp/kafka-logs (the present default) as the default data log directory
 
 Existing tests have been run to ensure that this change maintains backward 
 compatibility (i.e. doesn't fail when KAFKA_HOME isn't available/set) and 2 
 new test methods have been added to the KafkaConfigTest to ensure that this 
 change works.
 
 Although the change has been made to both .sh and .bat files, to support 
 this, I haven't actually tested this change on a Windows OS and would 
 appreciate if someone can test this there and let me know if they run into 
 any issues.
 
 
 Thanks,
 
 Jaikiran Pai
 




Re: Review Request 30403: Patch for KAFKA-1906

2015-01-28 Thread Jaikiran Pai


 On Jan. 29, 2015, 6:32 a.m., Sriharsha Chintalapani wrote:
  core/src/main/scala/kafka/server/KafkaConfig.scala, line 149
  https://reviews.apache.org/r/30403/diff/1/?file=839693#file839693line149
 
  I am not sure all of these code changes necessary. Instead change the 
  default server.properties log.dirs=./tmp/kafka-logs

Hi Sriharsha,

The problem with doing that is, the path becomes relative to the current 
working directory from where the Kafka (startup) script was run. For example, 
if Kafka was installed at /opt/kafka and if someone started the Kafka server, 
while currently being in /home/me/ folder:

``` 
[/home/me]$ /opt/kafka/bin/kafka-server-start.sh 
/opt/kafka/config/server.properties
```
then the Kafka data logs will end up being in a folder relative to /home/me 
instead of the /opt/kafka installation path.


- Jaikiran


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review70163
---


On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated Jan. 29, 2015, 6:24 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e 
   bin/windows/kafka-run-class.bat 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2 
   config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 6d74983472249eac808d361344c58cc2858ec971 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 82dce80d553957d8b5776a9e140c346d4e07f766 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to infer and setup KAFKA_HOME environment variable. This value is then used 
 by the KafkaConfig to decide what path to default to for the Kafka data logs, 
 in the absence of any explicitly set log.dirs (or log.dir) properties.
 
 Care has been taken to ensure that other mechanism which might presently be 
 bypassing the Kafka scripts, will still continue to function, since in the 
 absence of KAFKA_HOME environment property value, we fall back to 
 /tmp/kafka-logs (the present default) as the default data log directory
 
 Existing tests have been run to ensure that this change maintains backward 
 compatibility (i.e. doesn't fail when KAFKA_HOME isn't available/set) and 2 
 new test methods have been added to the KafkaConfigTest to ensure that this 
 change works.
 
 Although the change has been made to both .sh and .bat files, to support 
 this, I haven't actually tested this change on a Windows OS and would 
 appreciate if someone can test this there and let me know if they run into 
 any issues.
 
 
 Thanks,
 
 Jaikiran Pai
 




Re: Review Request 30403: Patch for KAFKA-1906

2015-01-28 Thread Sriharsha Chintalapani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/#review70163
---



core/src/main/scala/kafka/server/KafkaConfig.scala
https://reviews.apache.org/r/30403/#comment115201

I am not sure all of these code changes necessary. Instead change the 
default server.properties log.dirs=./tmp/kafka-logs


- Sriharsha Chintalapani


On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30403/
 ---
 
 (Updated Jan. 29, 2015, 6:24 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1906
 https://issues.apache.org/jira/browse/KAFKA-1906
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1906 Default the Kafka data log directory to 
 $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka 
 installation directory
 
 
 Diffs
 -
 
   bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e 
   bin/windows/kafka-run-class.bat 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2 
   config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 6d74983472249eac808d361344c58cc2858ec971 
   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
 82dce80d553957d8b5776a9e140c346d4e07f766 
 
 Diff: https://reviews.apache.org/r/30403/diff/
 
 
 Testing
 ---
 
 The change here involves updating the Kafka scripts (for Windows and * nix) 
 to infer and setup KAFKA_HOME environment variable. This value is then used 
 by the KafkaConfig to decide what path to default to for the Kafka data logs, 
 in the absence of any explicitly set log.dirs (or log.dir) properties.
 
 Care has been taken to ensure that other mechanism which might presently be 
 bypassing the Kafka scripts, will still continue to function, since in the 
 absence of KAFKA_HOME environment property value, we fall back to 
 /tmp/kafka-logs (the present default) as the default data log directory
 
 Existing tests have been run to ensure that this change maintains backward 
 compatibility (i.e. doesn't fail when KAFKA_HOME isn't available/set) and 2 
 new test methods have been added to the KafkaConfigTest to ensure that this 
 change works.
 
 Although the change has been made to both .sh and .bat files, to support 
 this, I haven't actually tested this change on a Windows OS and would 
 appreciate if someone can test this there and let me know if they run into 
 any issues.
 
 
 Thanks,
 
 Jaikiran Pai
 




Review Request 30403: Patch for KAFKA-1906

2015-01-28 Thread Jaikiran Pai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30403/
---

Review request for kafka.


Bugs: KAFKA-1906
https://issues.apache.org/jira/browse/KAFKA-1906


Repository: kafka


Description
---

KAFKA-1906 Default the Kafka data log directory to $KAFKA_HOME/data/kafka-logs 
directory, where KAFKA_HOME is the Kafka installation directory


Diffs
-

  bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e 
  bin/windows/kafka-run-class.bat 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2 
  config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5 
  core/src/main/scala/kafka/server/KafkaConfig.scala 
6d74983472249eac808d361344c58cc2858ec971 
  core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
82dce80d553957d8b5776a9e140c346d4e07f766 

Diff: https://reviews.apache.org/r/30403/diff/


Testing
---


Thanks,

Jaikiran Pai