Re: HDFS buffer sizes

2014-01-23 Thread Arpit Agarwal
HDFS does not appear to use dfs.stream-buffer-size.


On Thu, Jan 23, 2014 at 6:57 AM, John Lilley wrote:

>  What is the interaction between dfs.stream-buffer-size and
> dfs.client-write-packet-size?
>
> I see that the default for dfs.stream-buffer-size is 4K.  Does anyone have
> experience using larger buffers to optimize large writes?
>
> Thanks
>
>
> John
>
>
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


RE: HDFS buffer sizes

2014-01-24 Thread John Lilley
Ah, I see... it is a constant in CommonConfigurationKeysPublic.java:

  public static final int IO_FILE_BUFFER_SIZE_DEFAULT = 4096;

Are there benefits to increasing this for large reads or writes?
john
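
[Editor's note: the 4096 above is only the in-code fallback; the value actually read at create() time is the io.file.buffer.size property, so it can be overridden cluster-wide. A minimal core-site.xml fragment, where the property name is real but the 128 KB value is purely illustrative:]

```xml
<!-- core-site.xml: overrides the 4096-byte IO_FILE_BUFFER_SIZE_DEFAULT
     fallback. 131072 (128 KB) is an example value, not a recommendation. -->
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
```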

From: Arpit Agarwal [mailto:aagar...@hortonworks.com]
Sent: Thursday, January 23, 2014 3:31 PM
To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes

HDFS does not appear to use dfs.stream-buffer-size.






Re: HDFS buffer sizes

2014-01-24 Thread Arpit Agarwal
I don't think that value is used either, except in the legacy block reader,
which is turned off by default.





RE: HDFS buffer sizes

2014-01-25 Thread John Lilley
There is this in FileSystem.java, which would appear to use the default
buffer size of 4096 in the create() call unless otherwise specified in
io.file.buffer.size:

  public FSDataOutputStream create(Path f, short replication,
      Progressable progress) throws IOException {
    return create(f, true,
        getConf().getInt(
            CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_KEY,
            CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_DEFAULT),
        replication,
        getDefaultBlockSize(f), progress);
  }

But this discussion is missing the point. What I really want to know is:
is there any benefit to setting a larger bufferSize in FileSystem.create()
and FileSystem.append()?
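
[Editor's note: setting HDFS specifics aside, what a bufferSize argument buys, when it is honored at all, is the classic java.io trade-off: batching many small writes into fewer, larger downstream writes. A self-contained sketch using only plain java.io, with no Hadoop dependency; CountingSink is hypothetical scaffolding added here to make the flush granularity observable, and this is not the DFSOutputStream code path:]

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Shows what a larger buffer actually buys for small writes:
// fewer, larger writes reaching the underlying sink.
public class BufferSizeDemo {
    // Hypothetical sink that just counts how many write calls reach it.
    static final class CountingSink extends OutputStream {
        long calls = 0;
        @Override public void write(int b) { calls++; }
        @Override public void write(byte[] b, int off, int len) { calls++; }
    }

    // Push `total` single-byte writes through a buffer of `bufSize`
    // and report how many writes reached the sink.
    static long sinkCalls(int bufSize, int total) throws IOException {
        CountingSink sink = new CountingSink();
        try (BufferedOutputStream out = new BufferedOutputStream(sink, bufSize)) {
            for (int i = 0; i < total; i++) out.write(0);
        }
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        long small = sinkCalls(4 * 1024, 1 << 20);   // 4 KB buffer
        long large = sinkCalls(64 * 1024, 1 << 20);  // 64 KB buffer
        System.out.println(small + " vs " + large);  // prints "256 vs 16"
    }
}
```

[As the rest of the thread concludes, the HDFS client write path ignores the argument, so this effect would only apply where the buffer is actually wired in, e.g. local filesystems.]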

From: Arpit Agarwal [mailto:aagar...@hortonworks.com]
Sent: Friday, January 24, 2014 9:35 AM
To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes

I don't think that value is used either except in the legacy block reader which 
is turned off by default.






Re: HDFS buffer sizes

2014-01-27 Thread Arpit Agarwal
Looks like DistributedFileSystem ignores it, though.
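
[Editor's note: for tuning the HDFS write path itself, the knob the original question mentioned, dfs.client-write-packet-size, is one the client write path does consult (default 65536). A hedged hdfs-site.xml fragment, where the 128 KB value is only an example:]

```xml
<!-- hdfs-site.xml: dfs.client-write-packet-size controls the size of the
     packets the client streams to datanodes (default 65536). The value
     below is illustrative, not a recommendation. -->
<property>
  <name>dfs.client-write-packet-size</name>
  <value>131072</value>
</property>
```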





RE: HDFS buffer sizes

2014-02-09 Thread John Lilley
Thanks.  Experimentally, I have found that changing the buffer sizes has no
effect, so that makes sense.
John

From: Arpit Agarwal [mailto:aagar...@hortonworks.com]
Sent: Tuesday, January 28, 2014 12:35 AM
To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes

Looks like DistributedFileSystem ignores it though.

