Re: Policy on supporting old and external software packages and compat RPMS

2022-12-07 Thread Terry Barnaby

On 06/12/2022 16:37, Kevin Kofler via devel wrote:

Terry Barnaby wrote:

Well in this case I have created a suitable compat lib, all I did was
re-introduce the bits to the SPEC file that removed the building of the
compat lib and we are fine. I haven't separated it out from the main
ncurses SPEC through and have only done this locally as I have no
knowledge of the hoops to create a separate package and that seems like
the wrong way to do this in general. I have made this available to
others who will be in the same boat.

Typically, compatibility libraries should not be subpackages of the main
library. But ncurses is a bit peculiar in that, as I understand it, the
latest code can still be built with the old ABI. So in that case, it at
least makes sense to build both from the same SRPM. But only if they are
going to be maintained by the same maintainer(s), i.e., you should probably
sign up as a comaintainer for ncurses in Fedora if you want to do it that
way. And of course only as long as upstream continues supporting building
for the old ABI. If they drop support, then a separate compatibility package
with an old version that supports the old ABI will be needed.

 Kevin Kofler


Well I don't want to rock the boat with the maintainer; they are just 
doing what they think is expected.


I will just continue with my own local RPM for our own uses and provide 
it for anyone else who is in the same boat as we are.


Terry
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue


Re: Policy on supporting old and external software packages and compat RPMS

2022-12-07 Thread Terry Barnaby

On 06/12/2022 17:40, Vít Ondruch wrote:


Dne 06. 12. 22 v 17:09 Terry Barnaby napsal(a):

On 06/12/2022 15:56, Vít Ondruch wrote:


Dne 06. 12. 22 v 16:44 Terry Barnaby napsal(a):

On 06/12/2022 10:40, Dominik 'Rathann' Mierzejewski wrote:

On Tuesday, 06 December 2022 at 07:43, Terry Barnaby wrote:
[...]

My view is that compat versions of the commonly used shared libraries
for programs that are used on Redhat7 should be kept available until
most people are no longer producing programs for that system, for at least
n more years, and then I guess the same for Redhat8 once that really
becomes a core base platform that external people use. A core list of
these (there are only a few) could be kept somewhere, and when one is to
be deprecated, or users see problems when Fedora is updated, a decision
on this can then be made with that info. This would keep the Fedora
system relevant for more users' needs without too much work.

Well, that is still *some* work and someone would have to do it. Are you
volunteering?


In the case of ncurses, it is really just putting back into the SPEC
file that which was removed for F37, plus the extra storage on mirrors
for the compat RPMs.

If it's "just" that, why don't you do it yourself? Obviously, the
current ncurses maintainer decided it was time to drop the old v5 ABI
compat libs from the package. However, nothing is stopping you from
picking that up and maintaining an "ncurses5" package for as long as you
need it.

Regards,
Dominik


Well in this case I have created a suitable compat lib; all I did 
was re-introduce into the SPEC file the bits that were removed so 
that the compat lib is built again, and we are fine. I haven't 
separated it out from the main ncurses SPEC though, and have only 
done this locally, as I have no knowledge of the hoops needed to 
create a separate package and that seems like the wrong way to do 
this in general. I have made this available to others who will be 
in the same boat.


But the purpose of my thread here is a more general Fedora policy 
question, as it affects what applications users of Fedora can expect 
the OS to support. If the policy is to just support Fedora-built 
binaries within the particular version of Fedora, and to ignore 
external and commercial binaries built with, say, Redhat7 and provide 
no degree of compatibility, so be it; but it would be useful for 
everyone to know how the land lies and be on the same page. The 
maintainer of the ncurses package wasn't sure of the policy on this.



I wonder what this claim is based upon? Could you provide a link? 
And I wonder, if there were such a policy, would the maintainer have 
changed their mind? I am asking because I don't think that any policy 
can be enforced unless there is somebody to pick up the work. So in 
this case, it should be enough to convince the maintainer to revert 
the changes, shouldn't it?



Vít




Sorry, what claim?



My question was specifically about the last sentence of the quote, 
i.e. do you have a link to a BZ ticket, ML thread or anywhere else where 
this was discussed? That would help to get the complete picture.





I have no issue with the maintainer in this particular instance; as 
people have said, if a maintainer doesn't want to support something, 
that is fine. And obviously a policy may not be enforced, but at 
least it would be a guideline. I understand the maintainer asked a 
question on this list about when to deprecate, but had no replies.



Ah, so there was an ML thread you were referring to. Is it this one?

https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/Z3U6CAM4YVI2Y62QNQIHHHLPD7QEXBBV/#Z3U6CAM4YVI2Y62QNQIHHHLPD7QEXBBV 



And this could also provide more context:

https://bugzilla.redhat.com/show_bug.cgi?id=2150117


He took it that if the compat libs were not in use by any Fedora 
packages for the release in question they should/could be deprecated, 
which is one approach.



It seems that Miroslav would consider the change if there were such 
guidelines.


So do you have any specific guidelines draft?

I think this is always the same: if you really want Fedora to do 
something, then you might need to take some action. Sometimes it is 
enough to open a ticket or send an email; other times it means 
maintaining a package or drafting a specific proposal for guidelines. 
You can even join the Fedora governing bodies if you think it will help.


But I still think the best option for you and for the whole Fedora 
community would be if you picked up the compat-ncurses package 
maintenance.


I added some ideas for guidelines in a recent list email. But I suspect 
most would not approve :)


OK, I can look at that, but in this particular instance it seems to be 
a waste: extra effort, extra packages with extra SPECs and extra 
build time, when it is all there in the original SPEC file and was 
deprecated because that was considered to be the policy when a compat 
package is not used by any pure Fedora packages.






Vít



Re: Policy on supporting old and external software packages and compat RPMS

2022-12-07 Thread Terry Barnaby

On 06/12/2022 20:21, Josh Boyer wrote:

On Tue, Dec 6, 2022 at 2:01 PM Stephen Smoogen  wrote:



On Tue, 6 Dec 2022 at 13:50, Josh Boyer  wrote:

On Tue, Dec 6, 2022 at 11:27 AM Neal Gompa  wrote:

On Tue, Dec 6, 2022 at 7:54 AM Josh Boyer  wrote:


I wouldn't expect them to build for a Fedora version.  I also wouldn't
expect ISV software built against Red Hat Enterprise Linux 7 (or 8) to
work on Fedora either.


As a practical matter, I generally *do* expect them to be compatible
at some level. RHEL is a derivative of Fedora. Otherwise it gets very
difficult to use commercial software on a Fedora system. I know plenty
of ISVs that have a similar expectation.

That compatibility degrades over time though.  At this point in time,
with RHEL 7 being almost 9 years old, I would not expect software
built on RHEL 7 to work on any supported Fedora version.  If it does
work, that's fantastic and a testament to Fedora, but people should
not have that expectation.  Terry is politely asking for a policy that
would set that expectation.  I think the intention is good, but I
don't believe it to be realistic.


I think he would be happy with the policy spelled out in any form. Something 
like:

While the Fedora Project is the upstream of CentOS Stream and Red Hat 
Enterprise Linux, it does not give any guarantees of its releases being 
compatible with either. Software built in either may not work due to missing 
dependencies, changes in kernel, compile time or library options, or similar 
issues.

Ah!  Yes, making that clear would be good.


As a guideline that sounds a bit woolly to me and doesn't sound useful to 
maintainers. As a rough idea, either:


While the Fedora Project is the upstream of CentOS Stream and Red Hat 
Enterprise Linux, it does not attempt to provide any compatibility with any 
major current or past version of Red Hat Enterprise Linux (currently 7, 8, 9) 
or any other Linux distribution. Software binaries built for these generic 
systems may or may not work.

or

The Fedora Project attempts to provide a small degree of binary program 
compatibility, by means of compat libraries, with the major current and past 
versions of Red Hat Enterprise Linux (currently 7, 8, 9) and the past two 
releases of Fedora, but only for some reasonably well-used (judged by user 
feedback) external/commercial applications, for two years after their 
publication date, where this is easy to achieve as simple compat shared-library 
additions and the maintainers of the required packages are willing to provide 
such packages.


That's probably a bit much for some, but perhaps some watered-down derivative :)
Having a degree of binary compatibility aids external/commercial producers and 
makes Fedora more useful to more people.
Just my view.

Actually, is there some mechanism by which Fedora could work out how many 
people are using compat RPMs?
I guess this would require some system on the mirrors to report back the 
number of downloads of each package. Obviously this wouldn't catch everything 
(we have a cache of packages that we use across systems to reduce downloads 
across the Internet), but it might give some metrics to automate such things.



josh


To perhaps illustrate the point further, Red Hat Enterprise Linux does
not support applications built on version X-1 running on X unless it
is constrained to using a very very small set of dependencies (glibc,
libgcc/libstdc++, and a few smaller libraries).  Again, it may work
fine but the expectation and support policies set for RHEL are
(simplified) build on X, run on X where X is within a major version.
Our full documentation on this is available in the Application
Compatibility Guides.

josh



--
Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle. -- 
Ian MacClaren

Re: Policy on supporting old and external software packages and compat RPMS

2022-12-06 Thread Terry Barnaby

On 06/12/2022 15:56, Vít Ondruch wrote:


Dne 06. 12. 22 v 16:44 Terry Barnaby napsal(a):

On 06/12/2022 10:40, Dominik 'Rathann' Mierzejewski wrote:

On Tuesday, 06 December 2022 at 07:43, Terry Barnaby wrote:
[...]

My view is that compat versions of the commonly used shared libraries
for programs that are used on Redhat7 should be kept available until
most people are no longer producing programs for that system, for at least
n more years, and then I guess the same for Redhat8 once that really
becomes a core base platform that external people use. A core list of
these (there are only a few) could be kept somewhere, and when one is to
be deprecated, or users see problems when Fedora is updated, a decision
on this can then be made with that info. This would keep the Fedora
system relevant for more users' needs without too much work.

Well, that is still *some* work and someone would have to do it. Are you
volunteering?


In the case of ncurses, it is really just putting back into the SPEC
file that which was removed for F37, plus the extra storage on mirrors
for the compat RPMs.

If it's "just" that, why don't you do it yourself? Obviously, the
current ncurses maintainer decided it was time to drop the old v5 ABI
compat libs from the package. However, nothing is stopping you from
picking that up and maintaining an "ncurses5" package for as long as you
need it.

Regards,
Dominik


Well in this case I have created a suitable compat lib; all I did was 
re-introduce into the SPEC file the bits that were removed so that the 
compat lib is built again, and we are fine. I haven't separated it out 
from the main ncurses SPEC though, and have only done this locally, as 
I have no knowledge of the hoops needed to create a separate package 
and that seems like the wrong way to do this in general. I have made 
this available to others who will be in the same boat.


But the purpose of my thread here is a more general Fedora policy 
question, as it affects what applications users of Fedora can expect 
the OS to support. If the policy is to just support Fedora-built 
binaries within the particular version of Fedora, and to ignore 
external and commercial binaries built with, say, Redhat7 and provide 
no degree of compatibility, so be it; but it would be useful for 
everyone to know how the land lies and be on the same page. The 
maintainer of the ncurses package wasn't sure of the policy on this.



I wonder what this claim is based upon? Could you provide a link? And 
I wonder, if there were such a policy, would the maintainer have 
changed their mind? I am asking because I don't think that any policy 
can be enforced unless there is somebody to pick up the work. So in 
this case, it should be enough to convince the maintainer to revert 
the changes, shouldn't it?



Vít




Sorry, what claim?

I have no issue with the maintainer in this particular instance; as 
people have said, if a maintainer doesn't want to support something, that 
is fine. And obviously a policy may not be enforced, but at least it 
would be a guideline. I understand the maintainer asked a question on 
this list about when to deprecate, but had no replies. He took it that if 
the compat libs were not in use by any Fedora packages for the release 
in question they should/could be deprecated, which is one approach.




Re: Policy on supporting old and external software packages and compat RPMS

2022-12-06 Thread Terry Barnaby

On 06/12/2022 10:40, Dominik 'Rathann' Mierzejewski wrote:

On Tuesday, 06 December 2022 at 07:43, Terry Barnaby wrote:
[...]

My view is that compat versions of the commonly used shared libraries
for programs that are used on Redhat7 should be kept available until
most people are no longer producing programs for that system, for at least
n more years, and then I guess the same for Redhat8 once that really
becomes a core base platform that external people use. A core list of
these (there are only a few) could be kept somewhere, and when one is to
be deprecated, or users see problems when Fedora is updated, a decision
on this can then be made with that info. This would keep the Fedora
system relevant for more users' needs without too much work.

Well, that is still *some* work and someone would have to do it. Are you
volunteering?


In the case of ncurses, it is really just putting back into the SPEC
file that which was removed for F37, plus the extra storage on mirrors
for the compat RPMs.

If it's "just" that, why don't you do it yourself? Obviously, the
current ncurses maintainer decided it was time to drop the old v5 ABI
compat libs from the package. However, nothing is stopping you from
picking that up and maintaining an "ncurses5" package for as long as you
need it.

Regards,
Dominik


Well in this case I have created a suitable compat lib; all I did was 
re-introduce into the SPEC file the bits that were removed so that the 
compat lib is built again, and we are fine. I haven't separated it out 
from the main ncurses SPEC though, and have only done this locally, as 
I have no knowledge of the hoops needed to create a separate package 
and that seems like the wrong way to do this in general. I have made 
this available to others who will be in the same boat.


But the purpose of my thread here is a more general Fedora policy 
question, as it affects what applications users of Fedora can expect the 
OS to support. If the policy is to just support Fedora-built binaries 
within the particular version of Fedora, and to ignore external and 
commercial binaries built with, say, Redhat7 and provide no degree of 
compatibility, so be it; but it would be useful for everyone to know 
how the land lies and be on the same page. The maintainer of the 
ncurses package wasn't sure of the policy on this. In my case, not 
being able to easily run particular external/commercial programs built 
for Redhat7 will likely move me further away from working with Fedora 
(using it, reporting bugs, promoting it etc.) as it will be a less 
useful system for our usage.



Re: Policy on supporting old and external software packages and compat RPMS

2022-12-05 Thread Terry Barnaby

On 05/12/2022 16:00, Jarek Prokop wrote:



On 12/5/22 14:57, Peter Robinson wrote:

On Mon, Dec 5, 2022 at 12:01 PM Vitaly Zaitsev via devel
  wrote:

On 05/12/2022 12:39, Terry Barnaby wrote:

I am wondering what Fedora's policy is on deprecating old shared
libraries and particularly compat RPMs?

Fedora is a bleeding edge distribution. If you need old versions, you
should try CentOS or RHEL.

Being leading edge doesn't mean those use cases aren't relevant; one is
not mutually exclusive of the other, especially when it comes to
things like FPGAs etc.
We still have a myriad of VM orchestration solutions (libvirt, vagrant, 
gnome-boxes, and probably others I forgot).
There shouldn't be a problem spinning up a graphical environment of 
CentOS 7, getting EPEL and then using the tool.


Maybe the tool would work using the `toolbox` utility with the last known 
good Fedora version for the tool.

That is just my wild guess, however.

This is sometimes the tax for being "too" modern.
If the vendor does not want to support Fedora, we can't be held 
accountable for fully supporting their solution.
Does the software work? Yes? That is great! If not, well… we can't do 
much without the source code under a nice FOSS license, can we.


Regards,
Jarek

Although yes, there are things like VMs, containers etc., which we use 
for old development environments, all of these are, IMO, clumsy and 
awkward to use and difficult to manage, especially within automated build 
environments that build the complete code for an embedded system with 
various CPUs, FPGAs, other tools etc.


I know Fedora is fairly bleeding edge (really too bleeding edge for our 
uses, but others are too far behind), but there is obviously going to be 
a balance here so that Fedora is still useful to as many people as 
reasonably possible, hence the question on a policy.


In the particular case I am talking about, libncurses*5.so, this is a 
fairly common shared library used by quite a few command line tools. A 
lot of external/commercial programs are built on/for Redhat7 as it is a 
de-facto base Linux platform, and programs built on it will likely work 
on many other Linux systems. These companies are not going to build for 
a version of Fedora; it changes far too fast and would require large 
amounts of development/support work because of this. Some of the tools I 
am using were built/shipped in February 2022, so we are not talking 
about old tools here.


My view is that compat versions of the commonly used shared libraries 
for programs that are used on Redhat7 should be kept available until 
most people are no longer producing programs for that system, for at 
least n more years, and then I guess the same for Redhat8 once that 
really becomes a core base platform that external people use. A core 
list of these (there are only a few) could be kept somewhere, and when 
one is to be deprecated, or users see problems when Fedora is updated, 
a decision on this can then be made with that info. This would keep the 
Fedora system relevant for more users' needs without too much work. In 
the case of ncurses, it is really just putting back into the SPEC file 
that which was removed for F37, plus the extra storage on mirrors for 
the compat RPMs.









Policy on supporting old and external software packages and compat RPMS

2022-12-05 Thread Terry Barnaby

Hi,

With the latest release of Fedora37 we were hit with an issue where the 
ncurses-compat-libs RPM had been deprecated. Due to this, some of the 
tools we use would no longer install from their respective RPMs, or 
their tar-based installs would not run, as they needed the 
libncurses*5.so shared libraries.
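
To illustrate, the failure is purely a dynamic-linking one, so a machine can be checked for the old v5 ABI with a trivial dlopen() probe. A minimal sketch (the two sonames below are the usual ncurses v5 ones; check what your tools actually link against with ldd):

/* probe5.c - check whether the old ncurses v5 ABI libraries are present.
 * Build: cc probe5.c -o probe5 -ldl
 */
#include <dlfcn.h>
#include <stdio.h>

int main(void){
    const char *libs[] = { "libncurses.so.5", "libtinfo.so.5" };
    int missing = 0;
    unsigned i;

    for(i = 0; i < sizeof(libs) / sizeof(libs[0]); i++){
        void *h = dlopen(libs[i], RTLD_NOW);  /* same lookup the runtime linker does */
        if(h){
            printf("%s: found\n", libs[i]);
            dlclose(h);
        }else{
            printf("%s: MISSING (%s)\n", libs[i], dlerror());
            missing = 1;
        }
    }
    return missing;
}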


We use a number of software packages for Electronics and Software 
development, some of which are developed by organisations and companies 
outside of Fedora. This includes things like ARM GCC compilers, FPGA 
compilers, PCB tools, manufacturers' utilities etc. Many of these are 
built for Redhat7, it being a good general and stable base Linux system 
for these companies/organisations to target. There are a few shared 
libraries that are commonly used, and ncurses is one of these.


I am wondering what Fedora's policy is on deprecating old shared 
libraries and particularly compat RPMs?


If there isn't one, for the benefit of users and to make the Fedora OS 
more generally useful, can I suggest that relatively often-used compat 
RPMs are kept available at least while a major base system such as 
Redhat7 is still widely used as a build platform by external 
companies/organisations, and/or perhaps for at least 15? years (or some 
defined time) after they become compat RPMs?


Terry


Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-15 Thread Terry Barnaby

On 15/02/18 16:48, J. Bruce Fields wrote:

On Tue, Feb 13, 2018 at 07:01:22AM +, Terry Barnaby wrote:

The transaction system allows the write delegation to send the data to the
server's RAM without the overhead of synchronous writes to the disk.

As far as I'm concerned this problem is already solved--did you miss the
discussion of WRITE/COMMIT in other email?

The problem you're running into is with metadata (file creates) more
than data.
Not quite, I think, unless I missed something? With the transaction 
method on top of write delegations, the NFS server is always effectively 
in "async" mode: actual disk writes are asynchronous with the final 
real NFS WRITEs from the write delegation, and thus the disk writes can 
be optimised by the OS as normal, as and when, without the data being 
lost if the server dies. So there is no requirement for the bottleneck 
of server-side "sync" mode and special disk systems, unless needed for a 
particular requirement. As this method protects the data, even the 
metadata can be passed through asynchronously, apart from the original 
open, assuming the particular requirements are such that other clients 
don't need live access to this metadata. As far as I can see, with only 
quick thoughts so I may be missing something major (!), this method 
could almost fully remove the latency issues of NFS writes with small 
files while using conventional disk systems in the server.
As someone suggested that this is not the right place for this sort of 
discussion, I will try and start a discussion on the NFS kernel mailing 
list when I have some time. It would be good to see improved performance 
for NFS writes with small files while still being secure, and also to 
find out why fsync() does not work when the NFS export is in "async" mode!



PS: I have some RPC latency figures for some other NFS servers at work. The
NFS RPC latency on some of them is nearer the ICMP ping times, i.e. about
100us. Maybe quite a bit of CPU is needed to respond to an NFS RPC call
these days. The 500us RPC time was on an oldish home server using an Intel(R)
Core(TM)2 CPU 6300 @ 1.86GHz.

Tracing to figure out the source of the latency might still be
interesting.

Will see if I can find some time to do this, away for a bit at the moment.


--b.




Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-12 Thread Terry Barnaby

On 12/02/18 22:14, J. Bruce Fields wrote:

On Mon, Feb 12, 2018 at 08:12:58PM +, Terry Barnaby wrote:

On 12/02/18 17:35, Terry Barnaby wrote:

On 12/02/18 17:15, J. Bruce Fields wrote:

On Mon, Feb 12, 2018 at 05:09:32PM +, Terry Barnaby wrote:

One thing on this, that I forgot to ask: doesn't fsync() work properly
with an NFS server-side async export then?

No.

If a server sets "async" on an export, there is absolutely no way for a
client to guarantee that data reaches disk, or to know when it happens.

Possibly "ignore_sync", or "unsafe_sync", or something else, would be a
better name.

...

Just tried the use of fsync() with an NFS async mount, it appears to work.

That's expected, it's the *export* option that cheats, not the mount
option.

Also, even if you're using the async export option--fsync will still
flush data to server memory, just not necessarily to disk.


With a simple 'C' program as a test program I see the following data
rates/times when the program writes 100 MBytes to a single file over NFS
(open, write, write .., fsync) followed by close (after the timing):

NFS Write multiple small files 0.001584 ms/per file 0.615829 MBytes/sec CpuUsage: 3.2%
Disktest: Writing/Reading 100.00 MBytes in 1048576 Byte Chunks
Disk Write sequential data rate fsync: 1 107.250685 MBytes/sec CpuUsage: 13.4%
Disk Write sequential data rate fsync: 0 4758.953878 MBytes/sec CpuUsage: 66.7%

Without the fsync() call the data rate is obviously to buffers and with the
fsync() call it definitely looks like it is to disk.

Could be, or you could be network-limited, hard to tell without knowing
more.


Interestingly, it appears that the close() call actually does an effective
fsync() as well, as the close() takes an age when fsync() is not used.

Yes: http://nfs.sourceforge.net/#faq_a8

--b.


Quite right, it was network-limited (disk vs network speed is about the 
same). Using a slower USB stick disk shows that fsync() is not working 
with an NFSv4 "async" export.


But why is this? It just doesn't make sense to me that fsync() should 
work this way even with an NFS "async" export. Why shouldn't it do the 
right thing, "synchronize a file's in-core state with storage device" (I 
don't consider an NFS server a storage device, only the non-volatile 
devices it uses)? It seems it would be easy to flush the client's write 
buffer to the NFS server (as it does now) and then perform the fsync() 
on the server for the file in question. What am I missing?



Thinking out loud (and without a great deal of thought), on removing the 
NFS export "async" option, improving small-file write performance and 
keeping data security, it seems to me one method might be:


1. NFS server is always in "async" export mode (a client can mount in sync 
mode if wanted). Data and metadata (optionally) are buffered in RAM on the 
client and server.


2. Client fsync() works all the way to disk on the server.

3. Client sync() does an fsync() of each NFS file open for write. (Maybe 
this would be too much load on NFS servers ...)


4. You implement NFSv4 write delegations :)

5. There is a transaction-based system for file writes:

5.1 When a file is opened for write, a transaction is created (id). This 
is sent with the OPEN call.


5.2 Further file operations including SETATTR, WRITE are allocated as 
stages in this transaction (id.stage) and are just buffered in the 
client (no direct server RPC calls).


5.3 The client sends the NFS operations for this write, as and when, 
optimised into full sized network packets to the server. But the data 
and metadata are kept buffered in the client.


5.4 The server stores the data in its normal FS RAM buffers during the 
NFS RPC calls.


5.5 When the server actually writes the data to disk (using its normal 
optimised disk writing system for the file system and device in 
question), the transaction and stage (id.stage) are returned to the 
client (within an NFS reply). The client can now release the buffers up 
to this stage in the transaction.


The transaction system allows the write delegation to send the data to 
the server's RAM without the overhead of synchronous writes to the disk.


It does mean the data is stored in RAM in both the client and server at 
the same time (twice as much RAM usage). I am not sure how easy it would 
be to implement in the Linux kernel (NFS informed on FS buffer free?), 
and it would require NFS protocol extensions for the transactions.


With this method the client can resend the data on a server fail/reboot, 
and the data can be ensured to be on the disk after an fsync() or sync() 
(within reason!). It should offer the fastest write performance, should 
eliminate the untar performance issue with small-file creation/writes, 
and still be relatively secure with data if the server dies. Unless I am 
missing something?
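
To make the transaction bookkeeping concrete, here is a rough sketch of the per-open-file state the client might keep (all names are mine and purely illustrative; nothing like this exists in the NFS protocol or the Linux client today):

/* Illustrative only: per-open-file transaction state for the scheme above.
 * Each buffered operation is a numbered stage; when the server reports
 * "committed up to id.stage", the client can free buffers below that stage
 * and, after a server reboot, replay anything still pending.
 */
#include <stdint.h>
#include <stdlib.h>

typedef struct NfsTxnOp {
    uint64_t          stage;     /* position within the transaction (id.stage) */
    enum { TXN_SETATTR, TXN_WRITE } type;
    uint64_t          offset;    /* for TXN_WRITE */
    uint32_t          length;
    void             *data;      /* buffered until the server reports it on disk */
    struct NfsTxnOp  *next;
} NfsTxnOp;

typedef struct {
    uint64_t   id;               /* allocated at OPEN, sent with the OPEN call */
    uint64_t   nextStage;        /* stage number for the next buffered operation */
    uint64_t   committedStage;   /* highest stage the server reports as on disk */
    NfsTxnOp  *pending;          /* ops above committedStage, ordered by stage */
} NfsTxn;

/* On a "committed up to stage" reply, release buffers that are now durable. */
static void nfsTxnCommitted(NfsTxn *t, uint64_t stage){
    while(t->pending && t->pending->stage <= stage){
        NfsTxnOp *op = t->pending;
        t->pending = op->next;
        free(op->data);
        free(op);
    }
    if(stage > t->committedStage)
        t->committedStage = stage;
}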



PS: I have some RPC latency figures for some other NFS servers at work. 
The NFS RPC latency on some of them is nearer the ICMP ping times, i.e. 
about 100us. Maybe quite a bit of CPU is needed to respond to an NFS RPC 
call these days. The 500us RPC time was on an oldish home server using 
an Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz.

Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-12 Thread Terry Barnaby

On 12/02/18 17:35, Terry Barnaby wrote:

On 12/02/18 17:15, J. Bruce Fields wrote:

On Mon, Feb 12, 2018 at 05:09:32PM +, Terry Barnaby wrote:
One thing on this, that I forgot to ask: doesn't fsync() work properly 
with an NFS server-side async export then?

No.

If a server sets "async" on an export, there is absolutely no way for a
client to guarantee that data reaches disk, or to know when it happens.

Possibly "ignore_sync", or "unsafe_sync", or something else, would be a
better name.

--b.


Well, that seems like a major drop-off; I always thought that fsync() 
would work in this case. I don't understand why fsync() should not 
operate as intended. Sounds like this NFS async thing needs some work!



I still do not understand why NFS doesn't operate in the same way as a 
standard mount on this. The use of async is only for improved 
performance due to disk write latency and speed (or are there other 
reasons?)


So with a local system mount:

async: normal mode: all system calls manipulate the in-memory buffered 
disk structures (inodes etc.). Data/metadata is flushed to disk on 
fsync(), sync() and occasionally by the kernel. A process's data is not 
actually stored until fsync(), sync() etc.


sync: with the sync option. Data/metadata is written to disk before 
system calls return (all FS system calls?).



With an NFS mount I would have thought it should be the same.

async: normal mode: all system calls manipulate the in-memory buffered 
disk structures (inodes etc.); these would normally be on the server (so 
multiple clients can work with the same data), but with some options 
(for particular usage) client-side write buffering/caching could perhaps 
be used (i.e. data would not actually pass to the server during every 
FS system call). Data/metadata is flushed to the server's disk on 
fsync(), sync() and occasionally by the kernel (if client-side write 
caching is used, this flushes across the network and then flushes the 
server's buffers). A process's data is not actually stored until 
fsync(), sync() etc.


sync: with the client-side sync option. Data/metadata is written across 
NFS and to the server's disk before system calls return (all FS system 
calls?).


I really don't understand why the async option is implemented on the 
server export, although a sync option there could force sync for all 
clients for that mount. What am I missing? Is there some good reason 
(rather than history) that it is done this way?


Just tried the use of fsync() with an NFS async mount: it appears to 
work. With a simple 'C' test program I see the following data 
rates/times when the program writes 100 MBytes to a single file over 
NFS (open, write, write ..., fsync) followed by close (after the 
timing):


NFS Write multiple small files 0.001584 ms/per file 0.615829 MBytes/sec CpuUsage: 3.2%

Disktest: Writing/Reading 100.00 MBytes in 1048576 Byte Chunks
Disk Write sequential data rate fsync: 1 107.250685 MBytes/sec CpuUsage: 13.4%
Disk Write sequential data rate fsync: 0 4758.953878 MBytes/sec CpuUsage: 66.7%


Without the fsync() call the data rate is obviously to buffers, and with 
the fsync() call it definitely looks like it is to disk.

Interestingly, it appears that the close() call actually does an 
effective fsync() as well, as the close() takes an age when fsync() is 
not used.
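
The close() flush is easy to see in isolation by timing the close() call on its own; a minimal sketch (the file argument is just any path on the NFS mount under test):

/* closetime.c - write 100 MBytes with no fsync(), then time close() alone.
 * On an NFS client the close() should take roughly the network transfer
 * time, as the dirty pages are flushed to the server at that point.
 */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static double now(void){
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(int argc, char **argv){
    static char buf[1024 * 1024];       /* contents irrelevant for a timing test */
    double st;
    int f, n;

    if(argc < 2){
        fprintf(stderr, "usage: %s <file-on-nfs-mount>\n", argv[0]);
        return 1;
    }
    if((f = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0666)) < 0){
        perror("open");
        return 1;
    }
    for(n = 0; n < 100; n++)
        if(write(f, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
            perror("write");            /* data lands in the client page cache */

    st = now();
    close(f);                           /* dirty pages are flushed here */
    printf("close() took %.3f secs\n", now() - st);
    return 0;
}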


(By the way, I just got bitten by a Fedora27 KDE/Plasma/NetworkManager 
change that set the Ethernet interfaces of all my systems to 100 
MBits/s half duplex. Looks like the ability to configure Ethernet 
auto-negotiation has been added and the default is fixed 100 MBits/s 
half duplex!)


Basic test code (just the write function):

void nfsPerfWrite(int doFsync){
    int        f;
    char       buf[bufSize];
    int        n;
    double     st, et, r;
    int        nb;
    CpuStat    cpuStatStart;
    CpuStat    cpuStatEnd;
    double     cpuUsed;
    double     cpuUsage;

    /* bufSize, diskNum, fileName, CpuStat and getTime() are defined elsewhere
     * in the test program; buf is deliberately left uninitialised as its
     * contents are irrelevant to a data-rate test.
     * Note: the posting called cpuStatGet() with no argument, which could not
     * have filled cpuStatStart/cpuStatEnd; it is assumed here to take a
     * pointer to the CpuStat struct to fill.
     */
    sync();
    f = open64(fileName, O_RDWR | O_CREAT, 0666);
    if(f < 0){
        fprintf(stderr, "Error creating %s: %s\n", fileName, strerror(errno));
        return;
    }

    sync();
    cpuStatGet(&cpuStatStart);
    st = getTime();
    for(n = 0; n < diskNum; n++){
        if((nb = write(f, buf, bufSize)) != bufSize)
            fprintf(stderr, "WriteError: %d\n", nb);
    }

    if(doFsync)
        fsync(f);

    et = getTime();
    cpuStatGet(&cpuStatEnd);

    cpuStatEnd.user = cpuStatEnd.user - cpuStatStart.user;
    cpuStatEnd.nice = cpuStatEnd.nice - cpuStatStart.nice;
    cpuStatEnd.sys  = cpuStatEnd.sys  - cpuStatStart.sys;
    cpuStatEnd.idle = cpuStatEnd.idle - cpuStatStart.idle;
    cpuStatEnd.wait = cpuStatEnd.wait - cpuStatStart.wait;
    cpuStatEnd.hi   = cpuStatEnd.hi   - cpuStatStart.hi;
    cpuStatEnd.si   = cpuStatEnd.si   - cpuStatStart.si;

    cpuUsed = (cpuStatEnd.user + cpuStatEnd.nice + cpuStatEnd.sys +
        cpuStatEnd.hi + cpuStatEnd.si);

    cpuUsage = cpuUsed / (cpuUsed + cpuStatEnd.idle);

    /* The archived posting breaks off at this point; the tail below is a
     * plausible reconstruction of the rate calculation and report.
     */
    r = ((double)diskNum * bufSize) / ((et - st) * 1024.0 * 1024.0);
    printf("Disk Write sequential data rate fsync: %d %f MBytes/sec CpuUsage: %.1f%%\n",
        doFsync, r, 100.0 * cpuUsage);

    close(f);
}

Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-12 Thread Terry Barnaby

On 12/02/18 17:15, J. Bruce Fields wrote:

On Mon, Feb 12, 2018 at 05:09:32PM +, Terry Barnaby wrote:

One thing on this, that I forgot to ask, doesn't fsync() work properly with
an NFS server side async mount then ?

No.

If a server sets "async" on an export, there is absolutely no way for a
client to guarantee that data reaches disk, or to know when it happens.

Possibly "ignore_sync", or "unsafe_sync", or something else, would be a
better name.

--b.


Well, that seems like a major drop-off; I always thought that fsync() 
would work in this case. I don't understand why fsync() should not 
operate as intended. Sounds like this NFS async thing needs some work!



I still do not understand why NFS doesn't operate in the same way as a 
standard mount on this. The use of async is only for improved 
performance due to disk write latency and speed (or are there other 
reasons?)


So with a local system mount:

async: normal mode: all system calls manipulate the in-memory buffered 
disk structures (inodes etc.). Data/metadata is flushed to disk on 
fsync(), sync() and occasionally by the kernel. A process's data is not 
actually stored until fsync(), sync() etc.


sync: with the sync option. Data/metadata is written to disk before 
system calls return (all FS system calls?).



With an NFS mount I would have thought it should be the same.

async: normal mode: all system calls manipulate the in-memory buffered 
disk structures (inodes etc.); these would normally be on the server (so 
multiple clients can work with the same data), but with some options 
(for particular usage) client-side write buffering/caching could perhaps 
be used (i.e. data would not actually pass to the server during every FS 
system call). Data/metadata is flushed to the server's disk on fsync(), 
sync() and occasionally by the kernel (if client-side write caching is 
used, this flushes across the network and then flushes the server's 
buffers). A process's data is not actually stored until fsync(), sync() 
etc.


sync: with the client-side sync option. Data/metadata is written across 
NFS and to the server's disk before system calls return (all FS system 
calls?).


I really don't understand why the async option is implemented on the 
server export, although a sync option there could force sync for all 
clients for that mount. What am I missing? Is there some good reason 
(rather than history) that it is done this way?



Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-12 Thread Terry Barnaby

On 12/02/18 17:06, J. Bruce Fields wrote:

On Mon, Feb 12, 2018 at 09:08:47AM +, Terry Barnaby wrote:

On 09/02/18 08:25, nicolas.mail...@laposte.net wrote:

- Mail original -
De: "Terry Barnaby"

If
it was important to get the data to disk it would have been using
fsync(), FS sync, or some other transaction based app

??? Many people use NFS NAS because doing RAID+Backup on every client is too 
expensive. So yes, they *are* using NFS because it is important to get the data 
to disk.

Regards,


Yes, that is why I said some people would be using "FS sync". These people
would use the sync option, but then they would use the "sync" mount option
(ideally this would be set on the NFS client, as the clients know they need
this).

The "sync" mount option should not be necessary for data safety.
Carefully written apps know how to use fsync() and related calls at
points where they need data to be durable.

The server-side "async" export option, on the other hand, undermines
exactly those calls and therefore can result in lost or corrupted data
on a server crash, no matter how careful the application.

Again, we need to be very careful to distinguish between the client-side
"sync" mount option and the server-side "sync" export option.

--b.


One thing on this, that I forgot to ask: doesn't fsync() work properly 
with an NFS server-side async export then? I would have thought this 
would still work correctly.



Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-12 Thread Terry Barnaby

On 09/02/18 08:25, nicolas.mail...@laposte.net wrote:


- Mail original -
De: "Terry Barnaby"


If
it was important to get the data to disk it would have been using
fsync(), FS sync, or some other transaction based app

??? Many people use NFS NAS because doing RAID+Backup on every client is too 
expensive. So yes, they *are* using NFS because it is important to get the data 
to disk.

Regards,

Yes, that is why I said some people would be using "FS sync". These 
people would use the sync option, but then they would use the "sync" 
mount option (ideally this would be set on the NFS client, as the clients 
know they need this). Personally we use rsync, via an rsync server or 
over ssh, for backups like this, as NFS sync would be far too slow and 
rsync provides an easy incremental mode plus other benefits.



Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-08 Thread Terry Barnaby

On 06/02/18 21:48, J. Bruce Fields wrote:

On Tue, Feb 06, 2018 at 08:18:27PM +, Terry Barnaby wrote:

Well, when a program running on a system calls open(), write() etc. to the
local disk FS, the disk's contents are not actually updated. The data is in
kernel buffers until the next sync/fsync or some time has passed. So, in
your parlance, the OS write() call lies to the program. So it is by default
async unless the "sync" mount option is used when mounting the particular
file system in question.

That's right, but note applications are written with the knowledge that
OS's behave this way, and are given tools (sync, fsync, etc.) to manage
this behavior so that they still have some control over what survives a
crash.

(But sync & friends no longer do what they're supposed to on an Linux
server exporting with async.)
Doesn't fsync(), and perhaps sync(), work across NFS then, when the server 
has an async export? I thought they did, along with file locking to some 
extent.

Although it is different from the current NFS settings methods, I would have
thought that this should be the same for NFS. So if a client mounts a file
system normally it is async, ie write() data is in buffers somewhere (client
or server) unless the client mounts the file system in sync mode.

In fact, this is pretty much how it works, for write().

It didn't used to be that way--NFSv2 writes were all synchronous.

The problem is that if a server power cycles while it still had dirty
data in its caches, what should you do?
You can't ignore it--you'd just be silently losing data.  You could
return an error at some point, but "we just lost some of your data, no
idea what" isn't an error an application can really act on.
Yes, it is tricky error handling. But what does a program do when its 
local hard disk or machine dies underneath it anyway? I don't 
think a program on a remote system is particularly worse off if the NFS 
server dies; it may have to die if it can't do any special recovery. If 
it was important to get the data to disk it would have been using 
fsync(), FS sync, or some other transaction-based approach; indeed it 
shouldn't be using network remote disk mounts anyway. It all depends on 
what the program is doing and its usage requirements. A cc failing once 
in a blue moon is not a real issue (as long as it fails and removes its 
created files, or at least a make clean can be run). As I have said, I 
have used NFS async for about 27+ years on multiple systems with no 
problems when servers die, for the type of usage I use NFS for. The 
number of times a server has died is low in that time. Client systems 
have died many many more times (user issues, experimental 
programs/kernels, random program usage, single cheap disks, cheaper 
non-ECC RAM etc.)

So NFSv3 introduced a separation of write into WRITE and COMMIT.  The
client first sends a WRITE with the data, then latter sends a COMMIT
call that says "please don't return till that data I sent before is
actually on disk".

If the server reboots, there's a limited set of data that the client
needs to resend to recover (just data that's been written but not
committed.)

But we only have that for file data, metadata would be more complicated,
so stuff like file creates, setattr, directory operations, etc., are
still synchronous.


The only difference from the normal FS conventions I am suggesting is
to allow the server to stipulate "sync" on its export, forcing sync
mode for all clients on that FS.

Anyway, we don't have protocol to tell clients to do that.

As I said NFSv4.3 :)



In the case of a /home mount, for example, or a source code build file
system, it is normally only one client that is accessing the dir, and if a
write fails due to the server going down (an unlikely occurrence), it's not
much of an issue. I have only had this happen a couple of times in 28 years,
and then with no significant issues (power outage, disk fail pre-RAID etc.).

So if you have reliable servers and power, maybe you're comfortable with
the risk.  There's a reason that's not the default, though.
Well, it is the default for local FS mounts, so I really don't see why it 
should be different for network mounts. But anyway, for my usage NFS sync 
is completely unusable (as local sync mounts would be), so it has to be 
async NFS or local disks (13 secs local disk -> 3 mins NFS async -> 2 
hours NFS sync). I would have thought that would go for the majority of 
NFS usage. No issue to me though, as long as async can be configured and 
works well :)



4. The 0.5ms RPC latency seems a bit high (ICMP pings 0.12ms) . Maybe this
is worth investigating in the Linux kernel processing (how ?) ?

Yes, that'd be interesting to investigate.  With some kernel tracing I
think it should be possible to get high-resolution timings for the
processing of a single RPC call, which would make a good start.

It'd probably also be interesting to start with the simplest possible RPC
and then work our way up and see when the RTT increases the most--e.g.
does an RPC ping (an RPC with procedure 0, empty argument and reply)
already have a round-trip time closer to .5ms or .12ms?

Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-06 Thread Terry Barnaby

On 06/02/18 18:55, J. Bruce Fields wrote:

On Tue, Feb 06, 2018 at 06:49:28PM +, Terry Barnaby wrote:

On 05/02/18 14:52, J. Bruce Fields wrote:

Yet another poor NFSv3 performance issue. If I do a "ls -lR" of a certain
NFS mounted directory over a slow link (NFS over Openvpn over FTTP
80/20Mbps), just after mounting the file system (default NFSv4 mount with
async), it takes about 9 seconds. If I run the same "ls -lR" again, just
after, it takes about 60 seconds.

A wireshark trace might help.

Also, is it possible some process is writing while this is happening?

--b.


Ok, I have made some wireshark traces and put these at:

https://www.beam.ltd.uk/files/files//nfs/

There are other processes running obviously, but nothing that should be
doing anything to really affect this.

As a naive input, it looks like the client is using a cache but checking the
update times of each file individually using GETATTR. As it is using a
simple GETATTR per file in each directory, the latency of these RPC calls
mounts up. I guess it would be possible to check the cache status of all
files in a dir at once with one call, which would allow this to be faster when
a full readdir is in progress, like a "GETATTR_DIR" RPC call. The
overhead of the extra data would probably not affect the cache-check time
for a single file, as latency rather than amount of data is the killer.

Yeah, that's effectively what READDIR is--it can request attributes
along with the directory entries.  (In NFSv4--in NFSv3 there's a
separate call called READDIR_PLUS that gets attributes.)

So the client needs some heuristics to decide when to do a lot of
GETATTRs and when to instead do READDIR.  Those heuristics have gotten
some tweaking over time.

What kernel version is your client on again?

--b.


System is Fedora27, Kernel is: 4.14.16-300.fc27.x86_64 on both client 
and server.
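
For reference, the access pattern that triggers one GETATTR per file is essentially the stat()-per-entry loop below; a minimal sketch of what "ls -lR" boils down to (not the actual coreutils source):

/* lswalk.c - recursive directory walk, one stat() per entry, as "ls -lR" does.
 * On an NFS mount each fstatat() can become a GETATTR round trip when the
 * client revalidates its cache, so the total time scales with RPC latency
 * times the file count.
 */
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

static void walk(const char *path){
    DIR *d = opendir(path);
    struct dirent *e;

    if(!d)
        return;
    while((e = readdir(d)) != NULL){
        struct stat st;
        char sub[4096];

        if(!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
            continue;
        /* One attribute fetch per entry: this is the per-file GETATTR. */
        if(fstatat(dirfd(d), e->d_name, &st, AT_SYMLINK_NOFOLLOW) < 0)
            continue;
        printf("%10lld %s/%s\n", (long long)st.st_size, path, e->d_name);
        if(S_ISDIR(st.st_mode)){
            snprintf(sub, sizeof(sub), "%s/%s", path, e->d_name);
            walk(sub);
        }
    }
    closedir(d);
}

int main(int argc, char **argv){
    walk(argc > 1 ? argv[1] : ".");
    return 0;
}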




Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-06 Thread Terry Barnaby

On 05/02/18 23:06, J. Bruce Fields wrote:

On Thu, Feb 01, 2018 at 08:29:49AM +, Terry Barnaby wrote:

1. Have an OPEN-SETATTR-WRITE RPC call all in one and a SETATTR-CLOSE call
all in one. This would reduce the latency of a small file to 1ms rather than
3ms thus 66% faster. Would require the client to delay the OPEN/SETATTR
until the first WRITE. Not sure how possible this is in the implementations.
Maybe READ's could be improved as well but getting the OPEN through quick
may be better in this case ?

2. Could go further with an OPEN-SETATTR-WRITE-CLOSE RPC call. (0.5ms vs
3ms).

The protocol doesn't currently let us delay the OPEN like that,
unfortunately.
Yes, I should have thought of that; too focused on network traces and not 
thinking about the program/OS API :)

But maybe OPEN-SETATTR and SETATTR-CLOSE would be possible.


What we can do that might help: we can grant a write delegation in the
reply to the OPEN.  In theory that should allow the following operations
to be performed asynchronously, so the untar can immediately issue the
next OPEN without waiting.  (In practice I'm not sure what the current
client will do.)

I'm expecting to get to write delegations this year

It probably wouldn't be hard to hack the server to return write
delegations even when that's not necessarily correct, just to get an
idea what kind of speedup is available here.
That sounds good. I will have to read up on NFS write delegations; I am 
not sure how they work. I guess write() errors would be returned later 
than they actually occurred, etc.?



3. On sync/async modes personally I think it would be better for the client
to request the mount in sync/async mode. The setting of sync on the server
side would just enforce sync mode for all clients. If the server is in the
default async mode clients can mount using sync or async as to their
requirements. This seems to match normal VFS semantics and usage patterns
better.

The client-side and server-side options are both named "sync", but they
aren't really related.  The server-side "async" export option causes the
server to lie to clients, telling them that data has reached disk even
when it hasn't.  This affects all clients, whether they mounted with
"sync" or "async".  It violates the NFS specs, so it is not the default.

I don't understand your proposal.  It sounds like you believe that
mounting on the client side with the "sync" option will make your data
safe even if the "async" option is set on the server side?
Unfortunately that's not how it works.
Well, when a program running on a system calls open(), write() etc. to 
the local disk FS, the disk's contents are not actually updated. The data 
is in kernel buffers until the next sync/fsync or some time has passed. 
So, in your parlance, the OS write() call lies to the program. So it is 
by default async unless the "sync" mount option is used when mounting 
the particular file system in question.


Although it is different from the current NFS settings methods, I would 
have thought that this should be the same for NFS. So if a client mounts 
a file system normally it is async, i.e. write() data is in buffers 
somewhere (client or server), unless the client mounts the file system in 
sync mode. The only difference from the normal FS conventions I am 
suggesting is to allow the server to stipulate "sync" on its export, 
forcing sync mode for all clients on that FS. I know it is different from 
the standard NFS config, but it just seems more logical to me :) The 
sync/async option and the ramifications of it are really dependent on 
the client's usage in most cases.


In the case of a /home mount, for example, or a source code build file 
system, it is normally only one client that is accessing the dir, and if 
a write fails due to the server going down (an unlikely occurrence), it's 
not much of an issue. I have only had this happen a couple of times in 
28 years, and then with no significant issues (power outage, disk fail 
pre-RAID etc.).


I know that is not how NFS currently "works"; it just seems illogical to 
me the way it currently does work :)






4. The 0.5ms RPC latency seems a bit high (ICMP pings 0.12ms) . Maybe this
is worth investigating in the Linux kernel processing (how ?) ?

Yes, that'd be interesting to investigate.  With some kernel tracing I
think it should be possible to get high-resolution timings for the
processing of a single RPC call, which would make a good start.

It'd probably also interesting to start with the simplest possible RPC
and then work our way up and see when the RTT increases the most--e.g
does an RPC ping (an RPC with procedure 0, empty argument and reply)
already have a round-trip time closer to .5ms or .12ms?
Any pointers for trying this? I have a small amount of time as work is 
quiet at the moment.
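
As a starting point for the RPC ping, the old Sun RPC client API can still do this from user space; a minimal sketch (assuming libtirpc; 100003 is the NFS program number, and procedure 0 is the null procedure of every RPC program):

/* rpcping.c - time RPC procedure 0 (the null procedure) against an NFS server.
 * Build: cc rpcping.c -o rpcping -I/usr/include/tirpc -ltirpc
 */
#include <rpc/rpc.h>
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

#define NFS_PROGRAM 100003
#define NFS_VERSION 3
#define LOOPS       100

int main(int argc, char **argv){
    struct timeval tout = { 5, 0 };
    struct timespec st, et;
    CLIENT *clnt;
    double us;
    int n;

    if(argc < 2){
        fprintf(stderr, "usage: %s <server>\n", argv[0]);
        return 1;
    }
    if((clnt = clnt_create(argv[1], NFS_PROGRAM, NFS_VERSION, "tcp")) == NULL){
        clnt_pcreateerror(argv[1]);
        return 1;
    }

    clock_gettime(CLOCK_MONOTONIC, &st);
    for(n = 0; n < LOOPS; n++){
        /* NULLPROC: empty argument, empty reply - a pure RPC round trip */
        if(clnt_call(clnt, NULLPROC, (xdrproc_t)xdr_void, NULL,
                     (xdrproc_t)xdr_void, NULL, tout) != RPC_SUCCESS){
            clnt_perror(clnt, "clnt_call");
            return 1;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &et);

    us = ((et.tv_sec - st.tv_sec) * 1e9 + (et.tv_nsec - st.tv_nsec)) / 1000.0;
    printf("RPC NULL round trip: %.1f us average over %d calls\n", us / LOOPS, LOOPS);
    clnt_destroy(clnt);
    return 0;
}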



5. The 20ms RPC latency I see in sync mode needs a look at on my system
although async mode is 

Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-06 Thread Terry Barnaby
     689 4.715524465    192.168.201.1 192.168.202.2 TCP  1405 2049 → 679 [ACK] Seq=314555 Ack=47237 Win=1452 Len=1337 TSval=2646321749 TSecr=913651486 [TCP segment of a reassembled PDU]
     690 4.715911571    192.168.201.1 192.168.202.2 NFS  1449 V4 Reply (Call In 685) READDIR

NFS directory reads later:

No. Time   Source Destination   Protocol Length Info
     664 9.485593049    192.168.202.2 192.168.201.1 NFS  304 V4 Call (Reply In 669) READDIR FH: 0x1933e99e
     665 9.507596250    192.168.201.1 192.168.202.2 TCP  1405 2049 → 788 [ACK] Seq=127921 Ack=65730 Win=3076 Len=1337 TSval=2645776572 TSecr=913106316 [TCP segment of a reassembled PDU]
     666 9.507717425    192.168.201.1 192.168.202.2 TCP  1405 2049 → 788 [ACK] Seq=129258 Ack=65730 Win=3076 Len=1337 TSval=2645776572 TSecr=913106316 [TCP segment of a reassembled PDU]
     667 9.507733352    192.168.202.2 192.168.201.1 TCP  68 788 → 2049 [ACK] Seq=65730 Ack=130595 Win=1444 Len=0 TSval=913106338 TSecr=2645776572
     668 9.507987020    192.168.201.1 192.168.202.2 TCP  1405 2049 → 788 [ACK] Seq=130595 Ack=65730 Win=3076 Len=1337 TSval=2645776572 TSecr=913106316 [TCP segment of a reassembled PDU]
     669 9.508456847    192.168.201.1 192.168.202.2 NFS  989 V4 Reply (Call In 664) READDIR
     670 9.508472149    192.168.202.2 192.168.201.1 TCP  68 788 → 2049 [ACK] Seq=65730 Ack=132853 Win=1444 Len=0 TSval=913106338 TSecr=2645776572
     671 9.508880627    192.168.202.2 192.168.201.1 NFS  280 V4 Call (Reply In 672) GETATTR FH: 0x7e9e8300
     672 9.530375865    192.168.201.1 192.168.202.2 NFS  312 V4 Reply (Call In 671) GETATTR
     673 9.530564317    192.168.202.2 192.168.201.1 NFS  280 V4 Call (Reply In 674) GETATTR FH: 0xcb837ac9
     674 9.551906321    192.168.201.1 192.168.202.2 NFS  312 V4 Reply (Call In 673) GETATTR
     675 9.552064038    192.168.202.2 192.168.201.1 NFS  280 V4 Call (Reply In 676) GETATTR FH: 0xbf951d32
     676 9.574210528    192.168.201.1 192.168.202.2 NFS  312 V4 Reply (Call In 675) GETATTR
     677 9.574334117    192.168.202.2 192.168.201.1 NFS  280 V4 Call (Reply In 678) GETATTR FH: 0xd3f3dc3e
     678 9.595902902    192.168.201.1 192.168.202.2 NFS  312 V4 Reply (Call In 677) GETATTR
     679 9.596025484    192.168.202.2 192.168.201.1 NFS  280 V4 Call (Reply In 680) GETATTR FH: 0xf534332a
     680 9.617497794    192.168.201.1 192.168.202.2 NFS  312 V4 Reply (Call In 679) GETATTR
     681 9.617621218    192.168.202.2 192.168.201.1 NFS  280 V4 Call (Reply In 682) GETATTR FH: 0xa7e5bbc5
     682 9.639157371    192.168.201.1 192.168.202.2 NFS  312 V4 Reply (Call In 681) GETATTR
     683 9.639279098    192.168.202.2 192.168.201.1 NFS  280 V4 Call (Reply In 684) GETATTR FH: 0xa8050515
     684 9.660669335    192.168.201.1 192.168.202.2 NFS  312 V4 Reply (Call In 683) GETATTR
     685 9.660787725    192.168.202.2 192.168.201.1 NFS  304 V4 Call (Reply In 686) READDIR FH: 0x7e9e8300
     686 9.682612756    192.168.201.1 192.168.202.2 NFS  1472 V4 Reply (Call In 685) READDIR
     687 9.682646761    192.168.202.2 192.168.201.1 TCP  68 788 → 2049 [ACK] Seq=67450 Ack=135965 Win=1444 Len=0 TSval=913106513 TSecr=2645776747
     688 9.682906293    192.168.202.2 192.168.201.1 NFS  280 V4 Call (Reply In 689) GETATTR FH: 0xa8050515

Lots of GETATTR calls the second time around (each file ?).
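
If the repeat GETATTRs are cache revalidation, stretching the client's 
attribute cache is one mitigation to try (a sketch; actimeo is in seconds 
and trades coherence with other clients for fewer round trips):

    king.kingnet:/data /data nfs async,nocto,actimeo=60 0 0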

Really, NFS is quite broken performance-wise these days, and it "appears"
that significant/huge improvements are possible.

Anyone know what group/who is responsible for the NFS protocol these days ?

Also, what group/who is responsible for the Linux kernel's implementation of
it ?


--

Dr Terry BarnabyBEAM Ltd
Phone: +44 1454 324512  Northavon Business Center,
Email: te...@beam.ltd.ukDean Rd, Yate
Web: www.beam.ltd.ukBristol, BS37 5NH, UK
BEAM Engineering: Instrumentation, Electronics/Software/Systems
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-05 Thread Terry Barnaby

On 01/02/18 08:29, Terry Barnaby wrote:

On 01/02/18 01:34, Jeremy Linton wrote:

On 01/31/2018 09:49 AM, J. Bruce Fields wrote:

On Tue, Jan 30, 2018 at 01:52:49PM -0600, Jeremy Linton wrote:

Have you tried this with a '-o nfsvers=3' during mount? Did that help?

I noticed a large decrease in my kernel build times across NFS/lan 
a while
back after a machine/kernel/10g upgrade. After playing with 
mount/export
options filesystem tuning/etc, I got to this point of timing a 
bunch of
these operations vs the older machine, at which point I discovered 
that

simply backing down to NFSv3 solved the problem.

AKA a nfsv3 server on a 10 year old 4 disk xfs RAID5 on 1Gb 
ethernet, was
slower than a modern machine with a 8 disk xfs RAID5 on 10Gb on 
nfsv4. The
effect was enough to change a kernel build from ~45 minutes down to 
less

than 5.


Using NFSv3 in async mode is faster than NFSv4 in async mode (still 
abysmal in sync mode).


NFSv3 async: sync; time (tar -xf linux-4.14.15.tar.gz -C /data2/tmp; 
sync)


real    2m25.717s
user    0m8.739s
sys 0m13.362s

NFSv4 async: sync; time (tar -xf linux-4.14.15.tar.gz -C /data2/tmp; 
sync)


real    3m33.032s
user    0m8.506s
sys 0m16.930s

NFSv3 async: wireshark trace

No. Time   Source Destination   Protocol Length Info
  18527 2.815884979    192.168.202.2 192.168.202.1 NFS  
216    V3 CREATE Call (Reply In 18528), DH: 0x62f39428/dma.h Mode: 
EXCLUSIVE
  18528 2.816362338    192.168.202.1 192.168.202.2 NFS  
328    V3 CREATE Reply (Call In 18527)
  18529 2.816418841    192.168.202.2 192.168.202.1 NFS  
224    V3 SETATTR Call (Reply In 18530), FH: 0x13678ba0
  18530 2.816871820    192.168.202.1 192.168.202.2 NFS  
216    V3 SETATTR Reply (Call In 18529)
  18531 2.816966771    192.168.202.2 192.168.202.1 NFS  
1148   V3 WRITE Call (Reply In 18532), FH: 0x13678ba0 Offset: 0 Len: 
934 FILE_SYNC
  18532 2.817441291    192.168.202.1 192.168.202.2 NFS  
208    V3 WRITE Reply (Call In 18531) Len: 934 FILE_SYNC
  18533 2.817495775    192.168.202.2 192.168.202.1 NFS  
236    V3 SETATTR Call (Reply In 18534), FH: 0x13678ba0
  18534 2.817920346    192.168.202.1 192.168.202.2 NFS  
216    V3 SETATTR Reply (Call In 18533)
  18535 2.818002910    192.168.202.2 192.168.202.1 NFS  
216    V3 CREATE Call (Reply In 18536), DH: 0x62f39428/elf.h Mode: 
EXCLUSIVE
  18536 2.818492126    192.168.202.1 192.168.202.2 NFS  
328    V3 CREATE Reply (Call In 18535)


This is taking about 2ms for a small file write rather than 3ms for 
NFSv4. There is an extra GETATTR and CLOSE RPC in NFSv4 accounting for 
the difference.
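
One way to quantify the per-op overhead from a capture like the above (a 
sketch; assumes a reasonably recent tshark, where rpc.time is the 
dissector's call-to-reply delta):

    # Min/max/average response time per RPC procedure
    tshark -r trace.pcap -q -z rpc,programs
    # Raw per-reply latencies, for further processing
    tshark -r trace.pcap -Y 'nfs && rpc.time' -T fields -e rpc.time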


So where I am:

1. NFS in sync mode, at least on my two Fedora27 systems for my usage, 
is completely unusable (sync: 2 hours, async: 3 minutes, local disk: 
13 seconds).


2. NFS async mode is working, but the small writes are still very slow.

3. NFS in async mode is 30% better with NFSv3 than NFSv4 when writing 
small files due to the increased latency caused by NFSv4's two extra 
RPC calls.


I really think that in 2018 we should be able to have better NFS 
performance when writing many small files such as used in software 
development. This would speed up any system that was using NFS with 
this sort of workload dramatically and reduce power usage, all for some 
improvements in the NFS protocol.


I don't know the details of whether this would work, or who is responsible 
for NFS, but it would be good if possible to have some improvements 
(NFSv4.3 ?). Maybe:


1. Have an OPEN-SETATTR-WRITE RPC call all in one and a SETATTR-CLOSE 
call all in one. This would reduce the latency of a small file to 1ms 
rather than 3ms, thus 66% faster. It would require the client to delay 
the OPEN/SETATTR until the first WRITE. Not sure how possible this is in 
the implementations. Maybe READs could be improved as well, but 
getting the OPEN through quickly may be better in this case ?


2. Could go further with an OPEN-SETATTR-WRITE-CLOSE RPC call. (0.5ms 
vs 3ms).


3. On sync/async modes personally I think it would be better for the 
client to request the mount in sync/async mode. The setting of sync on 
the server side would just enforce sync mode for all clients. If the 
server is in the default async mode clients can mount using sync or 
async as to their requirements. This seems to match normal VFS 
semantics and usage patterns better.


4. The 0.5ms RPC latency seems a bit high (ICMP pings 0.12ms). Maybe 
this is worth investigating in the Linux kernel processing (how ?) ?


5. The 20ms RPC latency I see in sync mode needs a look at on my 
system although async mode is fine for my usage. Maybe this ends up as 
2 x 10ms drive seeks on ext4 and is thus expected.


Yet another poor NFSv3 performance issue. If I do a "ls -lR" of a 
certain NFS mounted directory over a slow link (NFS over Openvpn over 
FTTP 80/20Mbps), just after mountin

Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-02-01 Thread Terry Barnaby

On 01/02/18 01:34, Jeremy Linton wrote:

On 01/31/2018 09:49 AM, J. Bruce Fields wrote:

On Tue, Jan 30, 2018 at 01:52:49PM -0600, Jeremy Linton wrote:

Have you tried this with a '-o nfsvers=3' during mount? Did that help?

I noticed a large decrease in my kernel build times across NFS/lan a 
while
back after a machine/kernel/10g upgrade. After playing with 
mount/export

options filesystem tuning/etc, I got to this point of timing a bunch of
these operations vs the older machine, at which point I discovered that
simply backing down to NFSv3 solved the problem.

AKA a nfsv3 server on a 10 year old 4 disk xfs RAID5 on 1Gb 
ethernet, was
slower than a modern machine with a 8 disk xfs RAID5 on 10Gb on 
nfsv4. The
effect was enough to change a kernel build from ~45 minutes down to 
less

than 5.


Using NFSv3 in async mode is faster than NFSv4 in async mode (still 
abysmal in sync mode).


NFSv3 async: sync; time (tar -xf linux-4.14.15.tar.gz -C /data2/tmp; sync)

real    2m25.717s
user    0m8.739s
sys 0m13.362s

NFSv4 async: sync; time (tar -xf linux-4.14.15.tar.gz -C /data2/tmp; sync)

real    3m33.032s
user    0m8.506s
sys 0m16.930s

NFSv3 async: wireshark trace

No. Time   Source Destination   Protocol Length Info
  18527 2.815884979    192.168.202.2 192.168.202.1 NFS  
216    V3 CREATE Call (Reply In 18528), DH: 0x62f39428/dma.h Mode: EXCLUSIVE
  18528 2.816362338    192.168.202.1 192.168.202.2 NFS  
328    V3 CREATE Reply (Call In 18527)
  18529 2.816418841    192.168.202.2 192.168.202.1 NFS  
224    V3 SETATTR Call (Reply In 18530), FH: 0x13678ba0
  18530 2.816871820    192.168.202.1 192.168.202.2 NFS  
216    V3 SETATTR Reply (Call In 18529)
  18531 2.816966771    192.168.202.2 192.168.202.1 NFS  
1148   V3 WRITE Call (Reply In 18532), FH: 0x13678ba0 Offset: 0 Len: 934 
FILE_SYNC
  18532 2.817441291    192.168.202.1 192.168.202.2 NFS  
208    V3 WRITE Reply (Call In 18531) Len: 934 FILE_SYNC
  18533 2.817495775    192.168.202.2 192.168.202.1 NFS  
236    V3 SETATTR Call (Reply In 18534), FH: 0x13678ba0
  18534 2.817920346    192.168.202.1 192.168.202.2 NFS  
216    V3 SETATTR Reply (Call In 18533)
  18535 2.818002910    192.168.202.2 192.168.202.1 NFS  
216    V3 CREATE Call (Reply In 18536), DH: 0x62f39428/elf.h Mode: EXCLUSIVE
  18536 2.818492126    192.168.202.1 192.168.202.2 NFS  
328    V3 CREATE Reply (Call In 18535)


This is taking about 2ms for a small file write rather than 3ms for 
NFSv4. There is an extra GETATTR and CLOSE RPC in NFSv4 accounting for 
the difference.


So where I am:

1. NFS in sync mode, at least on my two Fedora27 systems for my usage, is 
completely unusable (sync: 2 hours, async: 3 minutes, local disk: 13 
seconds).


2. NFS async mode is working, but the small writes are still very slow.

3. NFS in async mode is 30% better with NFSv3 than NFSv4 when writing 
small files due to the increased latency caused by NFSv4's two extra RPC 
calls.


I really think that in 2018 we should be able to have better NFS 
performance when writing many small files such as used in software 
development. This would speed up any system that was using NFS with this 
sort of workload dramatically and reduce power usage, all for some 
improvements in the NFS protocol.


I don't know the details of whether this would work, or who is responsible 
for NFS, but it would be good if possible to have some improvements 
(NFSv4.3 ?). Maybe:


1. Have an OPEN-SETATTR-WRITE RPC call all in one and a SETATTR-CLOSE 
call all in one. This would reduce the latency of a small file to 1ms 
rather than 3ms, thus 66% faster. It would require the client to delay the 
OPEN/SETATTR until the first WRITE. Not sure how possible this is in the 
implementations. Maybe READs could be improved as well, but getting the 
OPEN through quickly may be better in this case ?


2. Could go further with an OPEN-SETATTR-WRITE-CLOSE RPC call. (0.5ms vs 
3ms).


3. On sync/async modes personally I think it would be better for the 
client to request the mount in sync/async mode. The setting of sync on 
the server side would just enforce sync mode for all clients. If the 
server is in the default async mode clients can mount using sync or 
async as to their requirements. This seems to match normal VFS semantics 
and usage patterns better.


4. The 0.5ms RPC latency seems a bit high (ICMP pings 0.12ms). Maybe 
this is worth investigating in the Linux kernel processing (how ?) ?


5. The 20ms RPC latency I see in sync mode needs a look at on my system 
although async mode is fine for my usage. Maybe this ends up as 2 x 10ms 
drive seeks on ext4 and is thus expected.
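
A crude way to measure the per-file cost directly, independent of tar (a 
sketch; /data2/tmp is the NFS mount used above and the file count is 
arbitrary):

    mkdir -p /data2/tmp/lat
    sync; time (for i in $(seq 1 1000); do echo x > /data2/tmp/lat/f$i; done; sync)
    # elapsed seconds / 1000 ~= per-file open+write+close latency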


___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-01-30 Thread Terry Barnaby

On 30/01/18 21:31, J. Bruce Fields wrote:

On Tue, Jan 30, 2018 at 07:03:17PM +, Terry Barnaby wrote:

It looks like each RPC call takes about 0.5ms. Why do there need to be so
many RPC calls for this ? The OPEN call could set the attribs, no need for
the later GETATTR or SETATTR calls.

The first SETATTR (which sets ctime and mtime to server's time) seems
unnecessary, maybe there's a client bug.

The second looks like tar's fault, strace shows it doing a utimensat()
on each file.  I don't know why or if that's optional.
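
The tar side is easy to confirm or sidestep (a sketch; strace -c just
counts the calls, and GNU tar's -m option skips restoring mtimes, which
should drop one SETATTR per file):

    strace -f -c -e trace=utimensat tar -xf linux-4.14.15.tar.gz -C /data2/tmp
    tar -xmf linux-4.14.15.tar.gz -C /data2/tmp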


Even the CLOSE could be integrated with the WRITE and taking this
further OPEN could do OPEN, SETATTR, and some WRITE all in one.

We'd probably need some new protocol to make it safe to return from the
open system call before we've gotten the OPEN reply from the server.

Write delegations might save us from having to wait for the other
operations.

Taking a look at my own setup, I see the same calls taking about 1ms.
The drives can't do that, so I've got a problem somewhere too

--b.


Also, on the 0.5ms: is this effectively the 1ms system tick, i.e. the NFS 
processing is not driven by the packet events (not pre-emptive) but by 
the next system tick ?


An ICMP ping is about 0.13ms (to and fro) between these systems. 
Although 0.5ms is relatively fast, I wouldn't have thought it should 
have to take 0.5ms for a minimal RPC even over TCP/IP.

___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-01-30 Thread Terry Barnaby


Being a daredevil, I have used the NFS async option for 27 years 
without an issue on multiple systems :)


I have just mounted my ext4 disk with the same options you were using 
and the same NFS export options, and the speed here looks the same as I 
had previously. As I can't wait 2+ hours, I'm just looking at 
ksysguard, and it is showing a network rate of about 10 KBytes/s while 
the directory on the server is growing in size very very slowly.


This is using the current Fedora27 kernel 4.14.14-300.fc27.x86_64.

I will have a look at using wireshark to see if this shows anything.


This is a snippet from a wireshark trace of the NFS traffic when untaring 
the linux kernel 4.14.15 sources into an NFSv4.2 mounted directory with 
the "sync" option on my NFS server. The whole untar would take > 2 hours 
vs 13 seconds direct to the disk. This is about 850 MBytes of 60k files. 
The following is a single, small file write.


No. Time   Source Destination   Protocol Length Info
   1880 11.928600315   192.168.202.2 192.168.202.1 NFS  
380    V4 Call (Reply In 1881) OPEN DH: 0xac0502f2/sysfs-c2port
   1881 11.950329198   192.168.202.1 192.168.202.2 NFS  
408    V4 Reply (Call In 1880) OPEN StateID: 0xaa72
   1882 11.950446430   192.168.202.2 192.168.202.1 NFS  
304    V4 Call (Reply In 1883) SETATTR FH: 0x825014ee
   1883 11.972608880   192.168.202.1 192.168.202.2 NFS  
336    V4 Reply (Call In 1882) SETATTR
   1884 11.972754709   192.168.202.2 192.168.202.1 TCP  
1516   785 → 2049 [ACK] Seq=465561 Ack=183381 Win=8990 Len=1448 
TSval=1663691771 TSecr=3103357902 [TCP segment of a reassembled PDU]
   1885 11.972763078   192.168.202.2 192.168.202.1 TCP  
1516   785 → 2049 [ACK] Seq=467009 Ack=183381 Win=8990 Len=1448 
TSval=1663691771 TSecr=3103357902 [TCP segment of a reassembled PDU]
   1886 11.972979437   192.168.202.2 192.168.202.1 NFS  
332    V4 Call (Reply In 1888) WRITE StateID: 0xafdf Offset: 0 Len: 2931
   1887 11.973074490   192.168.202.1 192.168.202.2 TCP  
68 2049 → 785 [ACK] Seq=183381 Ack=468721 Win=24557 Len=0 
TSval=3103357902 TSecr=1663691771
   1888 12.017153631   192.168.202.1 192.168.202.2 NFS  
248    V4 Reply (Call In 1886) WRITE
   1889 12.017338766   192.168.202.2 192.168.202.1 NFS  
260    V4 Call (Reply In 1890) GETATTR FH: 0x825014ee
   1890 12.017834411   192.168.202.1 192.168.202.2 NFS  
312    V4 Reply (Call In 1889) GETATTR
   1891 12.017961690   192.168.202.2 192.168.202.1 NFS  
328    V4 Call (Reply In 1892) SETATTR FH: 0x825014ee
   1892 12.039456634   192.168.202.1 192.168.202.2 NFS  
336    V4 Reply (Call In 1891) SETATTR
   1893 12.039536705   192.168.202.2 192.168.202.1 NFS  
284    V4 Call (Reply In 1894) CLOSE StateID: 0xaa72
   1894 12.039979528   192.168.202.1 192.168.202.2 NFS  
248    V4 Reply (Call In 1893) CLOSE
   1895 12.040077180   192.168.202.2 192.168.202.1 NFS  
392    V4 Call (Reply In 1896) OPEN DH: 0xac0502f2/sysfs-cfq-target-latency
   1896 12.061903798   192.168.202.1 192.168.202.2 NFS  
408    V4 Reply (Call In 1895) OPEN StateID: 0xaa72


It looks like this takes about 100ms to write this small file. With the 
approx 60k files in the archive this would take about 6000 secs, so it is 
in the 2-hour ballpark of the untar that I am seeing.


Looks like OPEN 21ms, SETATTR 22ms, WRITE 44ms, second SETATTR 21ms: a 
lot of time ...


The following is for an "async" mount:

No. Time   Source Destination   Protocol Length Info
  37393 7.630012608    192.168.202.2 192.168.202.1 NFS  
396    V4 Call (Reply In 37394) OPEN DH: 
0x1f828ac9/vidioc-dbg-g-chip-info.rst
  37394 7.630488451    192.168.202.1 192.168.202.2 NFS  
408    V4 Reply (Call In 37393) OPEN StateID: 0xaa72
  37395 7.630525117    192.168.202.2 192.168.202.1 NFS  
304    V4 Call (Reply In 37396) SETATTR FH: 0x0f65c554
  37396 7.630980560    192.168.202.1 192.168.202.2 NFS  
336    V4 Reply (Call In 37395) SETATTR
  37397 7.631035171    192.168.202.2 192.168.202.1 TCP  
1516   785 → 2049 [ACK] Seq=13054241 Ack=3620329 Win=8990 Len=1448 
TSval=1664595527 TSecr=3104261711 [TCP segment of a reassembled PDU]
  37398 7.631038994    192.168.202.2 192.168.202.1 TCP  
1516   785 → 2049 [ACK] Seq=13055689 Ack=3620329 Win=8990 Len=1448 
TSval=1664595527 TSecr=3104261711 [TCP segment of a reassembled PDU]
  37399 7.631042228    192.168.202.2 192.168.202.1 TCP  
1516   785 → 2049 [ACK] Seq=13057137 Ack=3620329 Win=8990 Len=1448 
TSval=1664595527 TSecr=3104261711 [TCP segment of a reassembled PDU]
  37400 7.631195554    192.168.202.2 192.168.202.1 NFS  
448    V4 Call (Reply In 37402) WRITE StateID: 0xafdf Offset: 0 Len: 4493
  37401 7.631277423    192.168.202.1 

Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-01-30 Thread Terry Barnaby

On 30/01/18 17:54, J. Bruce Fields wrote:

On Tue, Jan 30, 2018 at 12:31:22PM -0500, J. Bruce Fields wrote:

On Tue, Jan 30, 2018 at 04:49:41PM +, Terry Barnaby wrote:

I have just tried running the untar on our work systems. These are again
Fedora27 but newer hardware.
I set one of the servers NFS exports to just rw (removed the async option in
/etc/exports and ran exportfs -arv).
Remounted this NFS file system on a Fedora27 client and re-ran the test. I
have only waited 10mins but the overall network data rate is in the order of
0.1 MBytes/sec, so it looks like it will be a multiple-hour job as at home.
So I have two completely separate systems with the same performance over
NFS.
With your NFS "sync" test are you sure you set the "sync" mode on the server
and re-exported the file systems ?

Not being a daredevil, I use "sync" by default:

# exportfs -v /export 
(rw,sync,wdelay,hide,no_subtree_check,sec=sys,insecure,no_root_squash,no_all_squash)

For the "async" case I changed the options and actually rebooted, yes.

The filesystem is:

/dev/mapper/export-export on /export type ext4 
(rw,relatime,seclabel,nodelalloc,stripe=32,data=journal)

(I think data=journal is the only non-default, and I don't remember why
I chose that.)

Hah, well, with data=ordered (the default) the same untar (with "sync"
export) took 15m38s.  So... that probably wasn't an accident.

It may be irresponsible for me to guess given the state of my ignorance
about ext4 journaling, but perhaps writing everything to the journal and
delaying writing it out to its real location as long as possible allows
some sort of tradeoff between bandwidth and seeks that helps with this
sync-heavy workload.

--b.
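
For anyone wanting to reproduce that trade-off: ext4 refuses to change the 
data journalling mode on a live remount, so it has to go into fstab (a 
sketch, assuming the export device above; nodelalloc is implied by 
data=journal, which is why it shows up in the mount output):

    /dev/mapper/export-export /export ext4 data=journal 0 2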


Being a daredevil, I have used the NFS async option for 27 years without 
an issue on multiple systems :)


I have just mounted my ext4 disk with the same options you were using 
and the same NFS export options, and the speed here looks the same as I 
had previously. As I can't wait 2+ hours, I'm just looking at ksysguard, 
and it is showing a network rate of about 10 KBytes/s while the 
directory on the server is growing in size very very slowly.


This is using the current Fedora27 kernel 4.14.14-300.fc27.x86_64.

I will have a look at using wireshark to see if this shows anything.
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-01-30 Thread Terry Barnaby

On 30/01/18 16:22, J. Bruce Fields wrote:

On Tue, Jan 30, 2018 at 03:29:41PM +, Terry Barnaby wrote:

On 30/01/18 15:09, J. Bruce Fields wrote:

By comparison on my little home server (Fedora, ext4, a couple WD Black
1TB drives), with sync, that untar takes is 7:44, about 8ms/file.

Ok, that is far more reasonable, so something is up on my systems :)
What speed do you get with the server export set to async ?

I tried just now and got 4m2s.

The drives probably still have to do a seek or two per create, the
difference now is that we don't have to wait for one create to start the
next one, so the drives can work in parallel.

So given that I'm striping across two drives, I *think* it makes sense
that I'm getting about double the performance with the async export
option.

But that doesn't explain the difference between async and local
performance (22s when I tried the same untar directly on the server, 25s
when I included a final sync in the timing).  And your numbers are a
complete mystery.
I have just tried running the untar on our work systems. These are again 
Fedora27 but newer hardware.
I set one of the servers NFS exports to just rw (removed the async 
option in /etc/exports and ran exportfs -arv).
Remounted this NFS file system on a Fedora27 client and re-ran the test. 
I have only waited 10mins but the overall network data rate is in the 
order of 0.1 MBytes/sec, so it looks like it will be a multiple-hour job 
as at home.
So I have two completely separate systems with the same performance over 
NFS.
With your NFS "sync" test are you sure you set the "sync" mode on the 
server and re-exported the file systems ?




--b.


What's the disk configuration and what filesystem is this?

Those tests above were to a single SATA Western Digital Red 3TB, WDC
WD30EFRX-68EUZN0, using ext4.
Most of my tests have been to software RAID1 SATA disks, Western Digital Red
2TB on one server and Western Digital RE4 2TB WDC WD2003FYYS-02W0B1 on
another quad core Xeon server all using ext4 and all having plenty of RAM.
All on stock Fedora27 (both server and client) updated to date.


Is it really expected for NFS to be this bad these days with a reasonably
typical operation and are there no other tuning parameters that can help  ?

It's expected that the performance of single-threaded file creates will
depend on latency, not bandwidth.

I believe high-performance servers use battery backed write caches with
storage behind them that can do lots of IOPS.

(One thing I've been curious about is whether you could get better
performance cheap on this kind of workload ext3/4 striped across a few
drives and an external journal on SSD.  But when I experimented with
that a few years ago I found synchronous write latency wasn't much
better.  I didn't investigate why not, maybe that's just the way SSDs
are.)

--b.



___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-01-30 Thread Terry Barnaby

On 30/01/18 15:09, J. Bruce Fields wrote:

On Tue, Jan 30, 2018 at 08:49:27AM +, Terry Barnaby wrote:

On 29/01/18 22:28, J. Bruce Fields wrote:

On Mon, Jan 29, 2018 at 08:37:50PM +, Terry Barnaby wrote:

Ok, that's a shame, unless NFSv4's write performance with small files/dirs
is relatively ok, which it isn't on my systems.
Although async was "unsafe", this was not an issue in most standard
scenarios, such as an NFS mounted home directory only being used by one
client.
The async option also does not appear to work when using NFSv3. I guess it
was removed from that protocol at some point as well ?

This isn't related to the NFS protocol version.

I think everybody's confusing the server-side "async" export option with
the client-side mount "async" option.  They're not really related.

The unsafe thing that speeds up file creates is the server-side "async"
option.  Sounds like you tried to use the client-side mount option
instead, which wouldn't do anything.


What is the expected sort of write performance when un-taring, for example,
the linux kernel sources ? Is 2 MBytes/sec on average on a Gigabit link
typical (3 mins to untar 4.14.15) or should it be better ?

It's not bandwidth that matters, it's latency.

The file create isn't allowed to return until the server has created the
file and the change has actually reached disk.

So an RPC has to reach the server, which has to wait for disk, and then
the client has to get the RPC reply.  Usually it's the disk latency that
dominates.

And also the final close after the new file is written can't return
until all the new file data has reached disk.

v4.14.15 has 61305 files:

$ git ls-tree -r  v4.14.15|wc -l
61305

So time to create each file was about 3 minutes/61305 =~ 3ms.

So assuming two roundtrips per file, your disk latency is probably about
1.5ms?

You can improve the storage latency somehow (e.g. with a battery-backed
write cache) or use more parallelism (has anyone ever tried to write a
parallel untar?).  Or you can cheat and set the async export option, and
then the server will no longer wait for disk before replying.  The
problem is that on server reboot/crash, the client's assumptions about
which operations succeeded may turn out to be wrong.

--b.

Many thanks for your reply.

Yes, I understand the above (latency and normally synchronous nature of
NFS). I have async defined in the servers /etc/exports options. I have,
later, also defined it on the client side as the async option on the server
did not appear to be working and I wondered if with ongoing changes it had
been moved there (would make some sense for the client to define it and pass
this option over to the server as it knows, in most cases, if the bad
aspects of async would be an issue to its usage in the situation in
question).

It's a server with large disks, so SSD is not really an option. The use of
async is ok for my usage (mainly /home mounted and users home files only in
use by one client at a time etc etc.).

Note it's not concurrent access that will cause problems, it's server
crashes.  A UPS may reduce the risk a little.


However I have just found that async is actually working! I just did not
believe it was, due to the poor write performance. Without async on the
server the performance is truly abysmal. The figures I get for untaring the
kernel sources (4.14.15 895MBytes untared) using "rm -fr linux-4.14.15;
sync; time (tar -xf linux-4.14.15.tar.gz -C /data2/tmp; sync)" are:

Untar on server to its local disk:  13 seconds, effective data rate: 68
MBytes/s

Untar on server over NFSv4.2 with async on server:  3 minutes, effective
data rate: 4.9 MBytes/sec

Untar on server over NFSv4.2 without async on server:  2 hours 12 minutes,
effective data rate: 115 kBytes/s !!

2:12 is 7920 seconds, and you've got 61305 files to write, so that's
about 130ms/file.  That's more than I'd expect even if you're waiting
for a few seeks on each file create, so there may indeed be something
wrong.

By comparison on my little home server (Fedora, ext4, a couple WD Black
1TB drives), with sync, that untar takes is 7:44, about 8ms/file.

Ok, that is far more reasonable, so something is up on my systems :)
What speed do you get with the server export set to async ?


What's the disk configuration and what filesystem is this?
Those tests above were to a single SATA Western Digital Red 3TB, WDC 
WD30EFRX-68EUZN0, using ext4.
Most of my tests have been to software RAID1 SATA disks, Western Digital 
Red 2TB on one server and Western Digital RE4 2TB WDC WD2003FYYS-02W0B1 
on another quad core Xeon server all using ext4 and all having plenty of 
RAM.

All on stock Fedora27 (both server and client) updated to date.




Is it really expected for NFS to be this bad these days with a reasonably
typical operation and are there no other tuning parameters that can help  ?

It's expected that the performance of single-threaded file creates

Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-01-30 Thread Terry Barnaby

On 29/01/18 22:28, J. Bruce Fields wrote:

On Mon, Jan 29, 2018 at 08:37:50PM +, Terry Barnaby wrote:

Ok, that's a shame, unless NFSv4's write performance with small files/dirs
is relatively ok, which it isn't on my systems.
Although async was "unsafe", this was not an issue in most standard
scenarios, such as an NFS mounted home directory only being used by one
client.
The async option also does not appear to work when using NFSv3. I guess it
was removed from that protocol at some point as well ?

This isn't related to the NFS protocol version.

I think everybody's confusing the server-side "async" export option with
the client-side mount "async" option.  They're not really related.

The unsafe thing that speeds up file creates is the server-side "async"
option.  Sounds like you tried to use the client-side mount option
instead, which wouldn't do anything.


What is the expected sort of write performance when un-taring, for example,
the linux kernel sources ? Is 2 MBytes/sec on average on a Gigabit link
typical (3 mins to untar 4.14.15) or should it be better ?

It's not bandwidth that matters, it's latency.

The file create isn't allowed to return until the server has created the
file and the change has actually reached disk.

So an RPC has to reach the server, which has to wait for disk, and then
the client has to get the RPC reply.  Usually it's the disk latency that
dominates.

And also the final close after the new file is written can't return
until all the new file data has reached disk.

v4.14.15 has 61305 files:

$ git ls-tree -r  v4.14.15|wc -l
61305

So time to create each file was about 3 minutes/61305 =~ 3ms.

So assuming two roundtrips per file, your disk latency is probably about
1.5ms?

You can improve the storage latency somehow (e.g. with a battery-backed
write cache) or use more parallelism (has anyone ever tried to write a
parallel untar?).  Or you can cheat and set the async export option, and
then the server will no longer wait for disk before replying.  The
problem is that on server reboot/crash, the client's assumptions about
which operations succeeded may turn out to be wrong.

--b.


Many thanks for your reply.

Yes, I understand the above (latency and normally synchronous nature of 
NFS). I have async defined in the servers /etc/exports options. I have, 
later, also defined it on the client side as the async option on the 
server did not appear to be working and I wondered if with ongoing 
changes it had been moved there (would make some sense for the client to 
define it and pass this option over to the server as it knows, in most 
cases, if the bad aspects of async would be an issue to its usage in the 
situation in question).


It's a server with large disks, so SSD is not really an option. The use 
of async is ok for my usage (mainly /home mounted and users home files 
only in use by one client at a time etc etc.).


However I have just found that async is actually working! I just did not 
believe it was, due to the poor write performance. Without async on the 
server the performance is truly abysmal. The figures I get for untaring 
the kernel sources (4.14.15 895MBytes untared) using "rm -fr 
linux-4.14.15; sync; time (tar -xf linux-4.14.15.tar.gz -C /data2/tmp; 
sync)" are:


Untar on server to its local disk:  13 seconds, effective data rate: 68 
MBytes/s


Untar on server over NFSv4.2 with async on server:  3 minutes, effective 
data rate: 4.9 MBytes/sec


Untar on server over NFSv4.2 without async on server:  2 hours 12 
minutes, effective data rate: 115 kBytes/s !!


Is it really expected for NFS to be this bad these days with a 
reasonably typical operation and are there no other tuning parameters 
that can help  ?

___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working

2018-01-29 Thread Terry Barnaby

On 29/01/18 19:50, Steve Dickson wrote:


On 01/29/2018 12:42 PM, Steven Whitehouse wrote:



 Forwarded Message 
Subject:Re: Fedora27: NFS v4 terrible write performance, is async 
working
Date:   Sun, 28 Jan 2018 21:17:02 +
From:   Terry Barnaby <ter...@beam.ltd.uk>
To: Steven Whitehouse <swhit...@redhat.com>, Development discussions related to Fedora 
<devel@lists.fedoraproject.org>, Terry Barnaby <ter...@beam.ltd.uk>
CC: Steve Dickson <ste...@redhat.com>, Benjamin Coddington 
<bcodd...@redhat.com>



On 28/01/18 14:38, Steven Whitehouse wrote:

Hi,


On 28/01/18 07:48, Terry Barnaby wrote:

When doing a tar -xzf ... of a big source tar on an NFSv4 file system
the time taken is huge. I am seeing an overall data rate of about 1
MByte per second across the network interface. If I copy a single
large file I see a network data rate of about 110 MBytes/sec which is
about the limit of the Gigabit Ethernet interface I am using.

Now, in the past I have used the NFS "async" mount option to help
with write speed (lots of small files in the case of an untar of a
set of source files).

However, this does not seem to speed this up in Fedora27 and also I
don't see the "async" option listed when I run the "mount" command.
When I use the "sync" option it does show up in the "mount" list.

The question is, is the "async" option actually working with NFS v4
in Fedora27 ?

No. It's something left over from v3 that allowed servers to be unsafe.
With v4, the protocol defines the stableness of the writes.

Thanks for the reply.

Ok, that's a shame, unless NFSv4's write performance with small 
files/dirs is relatively ok, which it isn't on my systems.
Although async was "unsafe", this was not an issue in most standard 
scenarios, such as an NFS mounted home directory only being used by one 
client.
The async option also does not appear to work when using NFSv3. I guess 
it was removed from that protocol at some point as well ?



___

What server is in use? Is that Linux too? Also, is this v4.0 or v4.1?
I've copied in some of the NFS team who should be able to assist,

Steve.

Thanks for the reply.

Server is a Fedora27 as well. vers=4.2 the default. Same issue at other
sites with Fedora27.

Server export: "/data *.kingnet(rw,async,fsid=17)"

Client fstab: "king.kingnet:/data /data nfs async,nocto 0 0"

Client mount: "king.kingnet:/data on /data type nfs4
(rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,nocto,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.202.2,local_lock=none,addr=192.168.202.1)"



This looks normal except for setting fsid=17...

The best way to debug this is to open up a bugzilla report
and attach a (compressed) wireshark network trace to see
what is happening on the wire... The entire tar is not needed,
just a good chunk...

steved.
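
A capture along those lines can be grabbed with tcpdump (a sketch; adjust
the interface name, host and packet count):

    tcpdump -i eth0 -s 0 -c 50000 -w nfs-untar.pcap host king.kingnet and port 2049
    gzip nfs-untar.pcap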


Ok, will try doing the wireshark trace. What should I open a Bugzilla 
report against, the kernel ?


What is the expected sort of write performance when un-taring, for 
example, the linux kernel sources ? Is 2 MBytes/sec on average on a 
Gigabit link typical (3 mins to untar 4.14.15) or should it be better ?

___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Fedora27: NFS v4 terrible write performance, is async working

2018-01-28 Thread Terry Barnaby

On 28/01/18 15:47, Richard W.M. Jones wrote:

Please post questions on the users list:

https://lists.fedoraproject.org/admin/lists/users.lists.fedoraproject.org/

Sorry, will move there. Thought developers may be more into NFS than 
users in general these days, there being no responses in the users list.

___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Fedora27: NFS v4 terrible write performance, is async working

2018-01-28 Thread Terry Barnaby

On 28/01/18 14:38, Steven Whitehouse wrote:

Hi,


On 28/01/18 07:48, Terry Barnaby wrote:
When doing a tar -xzf ... of a big source tar on an NFSv4 file system 
the time taken is huge. I am seeing an overall data rate of about 1 
MByte per second across the network interface. If I copy a single 
large file I see a network data rate of about 110 MBytes/sec which is 
about the limit of the Gigabit Ethernet interface I am using.


Now, in the past I have used the NFS "async" mount option to help 
with write speed (lots of small files in the case of an untar of a 
set of source files).


However, this does not seem to speed this up in Fedora27 and also I 
don't see the "async" option listed when I run the "mount" command. 
When I use the "sync" option it does show up in the "mount" list.


The question is, is the "async" option actually working with NFS v4 
in Fedora27 ?

___


What server is in use? Is that Linux too? Also, is this v4.0 or v4.1? 
I've copied in some of the NFS team who should be able to assist,


Steve.


Thanks for the reply.

Server is a Fedora27 as well. vers=4.2 the default. Same issue at other 
sites with Fedora27.


Server export: "/data *.kingnet(rw,async,fsid=17)"

Client fstab: "king.kingnet:/data /data nfs async,nocto 0 0"

Client mount: "king.kingnet:/data on /data type nfs4 
(rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,nocto,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.202.2,local_lock=none,addr=192.168.202.1)"


___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Fedora27: NFS v4 terrible write performance, is async working

2018-01-27 Thread Terry Barnaby
When doing a tar -xzf ... of a big source tar on an NFSv4 file system 
the time taken is huge. I am seeing an overall data rate of about 1 
MByte per second across the network interface. If I copy a single large 
file I see a network data rate of about 110 MBytes/sec which is about 
the limit of the Gigabit Ethernet interface I am using.


Now, in the past I have used the NFS "async" mount option to help with 
write speed (lots of small files in the case of an untar of a set of 
source files).


However, this does not seem to speed this up in Fedora27 and also I 
don't see the "async" option listed when I run the "mount" command. When 
I use the "sync" option it does show up in the "mount" list.


The question is, is the "async" option actually working with NFS v4 in 
Fedora27 ?

___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


F18: Anaconda installer: Are there going to be any updates to this

2013-01-29 Thread Terry Barnaby
I am trying to build a spin of Fedora18 for local use, as I normally do with
Fedora releases, using pungi. All has been built fine but the installation
fails with an Anaconda popup stating "The following error occurred ..." with
no information on the error at all and nothing in the various virtual terminal
windows.

Obviously the Anaconda installer is pretty rough at the moment.

1. Is there a good way to debug it ?
2. Are there likely to be updated releases of Anaconda during F18's lifetime ?

Cheers


Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F16: Broken ypbind

2012-04-26 Thread Terry Barnaby
On 04/21/2012 08:38 AM, Terry Barnaby wrote:
 On 21/04/12 08:10, Terry Barnaby wrote:
 Some update appears to have broken the operation of ypbind recently.
 I have an F14 server that serves /home and implements NIS services (ypserv)
 and has been running fine for over a year.

 The F16 clients use NetworkManager with wired Ethernet set to 
 automatic/system
 connection. Everything comes up fine except for ypbind that fails. This
 has worked fine until a week or so ago (fails on multiple clients).

 The boot.log is enclosed, any ideas ?

 Cheers


 Terry


 Enclosed is a portion of /var/log/messages.
 
 The test.sh entries are a script executed from
 /lib/systemd/system/ypbind.service in the pre and post stages.
 
 It looks to me that systemd is starting a lot of network services
 including the NFS mounts and ypbind before the core network interface
 is up and running ...
 
 Is there a mechanism so that systemd will wait until the primary
 network is up before continuing ?  I assumed that ypbind waiting
 for network.target would have achieved that ?
 
 
 
 
 
For information: You need to run:
 systemctl enable NetworkManager-wait-online.service

This sets systemd to wait for the network to be up before generating
network.target. Programs that need the network up, such as ypbind, now
wait until the network is actually up.

Why this is not the default I will never know. I guess most developers are
using individual laptops and not servers with multiple clients :)
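
On newer systemd versions the wait can also be scoped to just the services
that need it, via a drop-in (a sketch using a hypothetical drop-in path;
network-online.target is only reached once NetworkManager-wait-online has
finished):

    # /etc/systemd/system/ypbind.service.d/wait-online.conf
    [Unit]
    Wants=network-online.target
    After=network-online.target

followed by a systemctl daemon-reload.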
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F16: Kernel bug with USB disks ?

2012-04-26 Thread Terry Barnaby
On 03/29/2012 10:44 AM, Terry Barnaby wrote:
 On 03/28/2012 12:31 PM, Caterpillar wrote:


  2012/3/28 Terry Barnaby ter...@beam.ltd.uk

 On 03/26/2012 09:20 PM, J. Randall Owens wrote:
  On 03/26/2012 06:05 AM, Terry Barnaby wrote:
  Hi,
 
  I am using the latest F16 kernel: 3.3.0-4.fc16.i686.PAE and am having
  problems with a MicroSD card connected to a USB card reader. This has
  been working fine until recently (at least in F14 on the same 
 hardware).
 
  The problem is that umount does not appear to be working correctly.
  I have an ext3 file system on the card. I can mount it, and I can copy
  files to it. However when I use the umount ... command it returns
  instantly (should sync the files to the card). The system says the 
 card
  is unmounted (at least it is not listed with mount, df etc).
 
  However if I run sync, there is a lot of disk activity to the card ...
 
  Also if I try and run mkfs it says the device is in use ...
  If I mount a blank card it lists the files present on the previous 
 card ...
 
  This sounds like a nasty kernel bug ...
  Anyone else seen this ?
 
  I thought I'd noticed something like this with 3.2.x kernels also; I
  couldn't narrow it down more than that.  In my case, it's a USB 
 external
  HDD.  After unmounting, I have an old habit of running 3 syncs in one
  line.  And lately, I've noticed that I don't even get that disk 
 activity
  until I give it a second trio of syncs, which certainly doesn't seem 
 right.
  Let me check right now with 3.3.0-4...  Odd, now I do get the activity
  at about the same time as the umount, and no further activity when I
  issue the syncs.  Seems to be the opposite of what you've reported.
 
 Kernel 3.2.10-3.fc16.i686.PAE also appears to be broken.
 This seems really very nasty, does it apply to other disks or just to USB
 ones ...
 I have added Bugzilla bug: 806909 for this.

 https://bugzilla.redhat.com/show_bug.cgi?id=806909

 Cheers


 Terry
 --
 devel mailing list
  devel@lists.fedoraproject.org
 https://admin.fedoraproject.org/mailman/listinfo/devel

 I just commented your bugreport with mine that submitted some months ago


 For people following this, it appears that if the cups printer daemon is
 running then umount fails on USB disks.
 
 How on earth the cups daemon can affect disk data storage unmounts is
 baffling to me. Data storage is sacrosanct, how can the Linux kernel allow
 this to happen ?
 
 Cheers
 
 
 Terry
Just a warning to all, this bug is still present.
If the cupsd is running and you umount a USB disk, then the disk will not
be properly unmounted and any written data will not have been synced.
A pretty major bug that is still there ...

-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

F16: Broken ypbind

2012-04-21 Thread Terry Barnaby

Some update appears to have broken the operation of ypbind recently.
I have an F14 server that serves /home and implements NIS services (ypserv)
and has been running fine for over a year.

The F16 clients use NetworkManager with wired Ethernet set to automatic/system 
connection. Everything comes up fine except for ypbind that fails. This

has worked fine until a week or so ago (fails on multiple clients).

The boot.log is enclosed, any ideas ?

Cheers


Terry
Welcome to Fedora release 16 (Verne)!

Starting Collect Read-Ahead Data...
Starting Replay Read-Ahead Data... 
Starting Syslog Kernel Log Buffer Bridge...
Started Syslog Kernel Log Buffer Bridge[  OK  ]
Started Lock Directory [  OK  ]
Started Runtime Directory  [  OK  ]
Starting Media Directory...
Starting RPC Pipe File System...   
Starting Software RAID Monitor Takeover... 
Starting Debug File System...  
Starting Huge Pages File System... 
Starting POSIX Message Queue File System...
Starting Security File System...   
Starting udev Coldplug all Devices...  
Starting udev Kernel Device Manager... 
Started Collect Read-Ahead Data[  OK  ]
Started Replay Read-Ahead Data [  OK  ]
Starting Load legacy module configuration...   
Starting Remount API VFS...

Re: F16: Broken ypbind

2012-04-21 Thread Terry Barnaby

On 21/04/12 08:10, Terry Barnaby wrote:

Some update appears to have broken the operation of ypbind recently.
I have an F14 server that serves /home and implements NIS services (ypserv)
and has been running fine for over a year.

The F16 clients use NetworkManager with wired Ethernet set to automatic/system
connection. Everything comes up fine except for ypbind that fails. This
has worked fine until a week or so ago (fails on multiple clients).

The boot.log is enclosed, any ideas ?

Cheers


Terry



Enclosed is a portion of /var/log/messages.

The test.sh entries are a script executed from 
/lib/systemd/system/ypbind.service in the pre and post stages.


It looks to me that systemd is starting a lot of network services
including the NFS mounts and ypbind before the core network interface
is up and running ...

Is there a mechanism so that systemd will wait until the primary
network is up before continuing ?  I assumed that ypbind waiting
for network.target would have achieved that ?



Apr 21 09:17:55 study NetworkManager[696]:ifcfg-rh: Acquired D-Bus service 
com.redhat.ifcfgrh1
Apr 21 09:17:55 study NetworkManager[696]: info Loaded plugin ifcfg-rh: (c) 
2007 - 2010 Red Hat, Inc.  To report bugs please use the NetworkManager mailing 
list.
Apr 21 09:17:55 study NetworkManager[696]: info Loaded plugin keyfile: (c) 
2007 - 2010 Red Hat, Inc.  To report bugs please use the NetworkManager mailing 
list.
Apr 21 09:17:55 study NetworkManager[696]:ifcfg-rh: parsing 
/etc/sysconfig/network-scripts/ifcfg-Auto_Ethernet ... 
Apr 21 09:17:55 study NetworkManager[696]:ifcfg-rh: read connection 
'New Wired Connection'
Apr 21 09:17:55 study NetworkManager[696]:ifcfg-rh: parsing 
/etc/sysconfig/network-scripts/ifcfg-lo ... 
Apr 21 09:17:55 study NetworkManager[696]: info trying to start the modem 
manager...
Apr 21 09:17:55 study dbus[744]: [system] Activating service 
name='org.freedesktop.ModemManager' (using servicehelper)
Apr 21 09:17:55 study dbus-daemon[744]: dbus[744]: [system] Activating service 
name='org.freedesktop.ModemManager' (using servicehelper)
Apr 21 09:17:56 study NetworkManager[696]: info monitoring kernel firmware 
directory '/lib/firmware'.
Apr 21 09:17:56 study dbus-daemon[744]: dbus[744]: [system] Activating via 
systemd: service name='org.bluez' unit='dbus-org.bluez.service'
Apr 21 09:17:56 study dbus[744]: [system] Activating via systemd: service 
name='org.bluez' unit='dbus-org.bluez.service'
Apr 21 09:17:56 study dbus-daemon[744]: modem-manager[793]: info  
ModemManager (version 0.4.998-1.git20110706.fc16) starting...
Apr 21 09:17:56 study modem-manager[793]: info  ModemManager (version 
0.4.998-1.git20110706.fc16) starting...
Apr 21 09:17:56 study dbus[744]: [system] Activation via systemd failed for 
unit 'dbus-org.bluez.service': Unit dbus-org.bluez.service failed to load: No 
such file or directory. See system logs and 'systemctl status 
dbus-org.bluez.service' for details.
Apr 21 09:17:56 study dbus-daemon[744]: dbus[744]: [system] Activation via 
systemd failed for unit 'dbus-org.bluez.service': Unit dbus-org.bluez.service 
failed to load: No such file or directory. See system logs and 'systemctl 
status dbus-org.bluez.service' for details.
Apr 21 09:17:56 study iscsid: iSCSI logger with pid=799 started!
Apr 21 09:17:56 study NetworkManager[696]: info WiFi enabled by radio 
killswitch; enabled by state file
Apr 21 09:17:56 study NetworkManager[696]: info WWAN enabled by radio 
killswitch; enabled by state file
Apr 21 09:17:56 study NetworkManager[696]: info WiMAX enabled by radio 
killswitch; enabled by state file
Apr 21 09:17:56 study NetworkManager[696]: info Networking is enabled by 
state file
Apr 21 09:17:56 study NetworkManager[696]: warn failed to allocate link 
cache: (-12) Netlink Error (errno = Operation not supported)
Apr 21 09:17:56 study NetworkManager[696]: info (p3p1): carrier is OFF
Apr 21 09:17:56 study NetworkManager[696]: info (p3p1): new Ethernet device 
(driver: 'r8169' ifindex: 2)
Apr 21 09:17:56 study NetworkManager[696]: info (p3p1): exported as 
/org/freedesktop/NetworkManager/Devices/0
Apr 21 09:17:56 study NetworkManager[696]: info (p3p1): now managed
Apr 21 09:17:56 study NetworkManager[696]: info (p3p1): device state change: 
unmanaged - unavailable (reason 'managed') [10 20 2]
Apr 21 09:17:56 study NetworkManager[696]: info (p3p1): bringing up device.
Apr 21 09:17:56 study kernel: [   22.794871] r8169 :02:05.0: p3p1: link down
Apr 21 09:17:56 study kernel: [   22.794884] r8169 :02:05.0: p3p1: link down
Apr 21 09:17:56 study dbus[744]: [system] Successfully activated service 
'org.freedesktop.ModemManager'
Apr 21 09:17:56 study dbus-daemon[744]: dbus[744]: [system] Successfully 
activated service 'org.freedesktop.ModemManager'
Apr 21 09:17:56 study NetworkManager[696]: info (p3p1): preparing device.
Apr 21 09:17:56 study NetworkManager[696]: info (p3p1): deactivating device 
(reason 'managed') [2]
Apr 21 09:17:56 study modem

Re: F16: Kernel bug with USB disks ?

2012-03-29 Thread Terry Barnaby
On 03/28/2012 12:31 PM, Caterpillar wrote:
 
 
 2012/3/28 Terry Barnaby ter...@beam.ltd.uk
 
 On 03/26/2012 09:20 PM, J. Randall Owens wrote:
  On 03/26/2012 06:05 AM, Terry Barnaby wrote:
  Hi,
 
  I am using the latest F16 kernel: 3.3.0-4.fc16.i686.PAE and am having
  problems with a MicroSD card connected to a USB card reader. This has
  been working fine until recently (at least in F14 on the same 
 hardware).
 
  The problem is that umount does not appear to be working correctly.
  I have an ext3 file system on the card. I can mount it, and I can copy
  files to it. However when I use the umount ... command it returns
  instantly (should sync the files to the card). The system says the card
  is unmounted (at least it is not listed with mount, df etc).
 
  However if I run sync, there is a lot of disk activity to the card ...
 
  Also if I try and run mkfs it says the device is in use ...
  If I mount a blank card it lists the files present on the previous 
 card ...
 
  This sounds like a nasty kernel bug ...
  Anyone else seen this ?
 
  I thought I'd noticed something like this with 3.2.x kernels also; I
  couldn't narrow it down more than that.  In my case, it's a USB external
  HDD.  After unmounting, I have an old habit of running 3 syncs in one
  line.  And lately, I've noticed that I don't even get that disk activity
  until I give it a second trio of syncs, which certainly doesn't seem 
 right.
  Let me check right now with 3.3.0-4...  Odd, now I do get the activity
  at about the same time as the umount, and no further activity when I
  issue the syncs.  Seems to be the opposite of what you've reported.
 
 Kernel 3.2.10-3.fc16.i686.PAE also appears to be broken.
 This seems really very nasty, does it apply to other disks or just to USB
 ones ...
 I have added Bugzilla bug: 806909 for this.
 
 https://bugzilla.redhat.com/show_bug.cgi?id=806909
 
 Cheers
 
 
 Terry
 --
 devel mailing list
  devel@lists.fedoraproject.org
 https://admin.fedoraproject.org/mailman/listinfo/devel
 
 I just commented your bugreport with mine that submitted some months ago
 
 
For people following this, it appears that if the cups printer daemon is
running then umount fails on USB disks.

How on earth the cups daemon can affect disk data storage unmounts is
baffling to me. Data storage is sacrosanct, how can the Linux kernel allow
this to happen ?

Cheers


Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

F16: Kernel bug with USB disks ?

2012-03-26 Thread Terry Barnaby
Hi,

I am using the latest F16 kernel: 3.3.0-4.fc16.i686.PAE and am having
problems with a MicroSD card connected to a USB card reader. This has
been working fine until recently (at least in F14 on the same hardware).

The problem is that umount does not appear to be working correctly.
I have an ext3 file system on the card. I can mount it, and I can copy
files to it. However when I use the umount ... command it returns
instantly (should sync the files to the card). The system says the card
is unmounted (at least it is not listed with mount, df etc).

However if I run sync, there is a lot of disk activity to the card ...

Also if I try and run mkfs it says the device is in use ...
If I mount a blank card it lists the files present on the previous card ...

This sounds like a nasty kernel bug ...
Anyone else seen this ?

Cheers



Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F16: Kernel bug with USB disks ?

2012-03-26 Thread Terry Barnaby
On 03/26/2012 02:05 PM, Terry Barnaby wrote:
 Hi,
 
 I am using the latest F16 kernel: 3.3.0-4.fc16.i686.PAE and am having
 problems with a MicroSD card connected to a USB card reader. This has
 been working fine until recently (at least in F14 on the same hardware).
 
 The problem is that umount does not appear to be working correctly.
 I have an ext3 file system on the card. I can mount it, and I can copy
 files to it. However when I use the umount ... command it returns
 instantly (should sync the files to the card). The system says the card
 is unmounted (at least it is not listed with mount, df etc).
 
 However if I run sync, there is a lot of disk activity to the card ...
 
  Also if I try and run mkfs it says the device is in use ...
 If I mount a blank card it lists the files present on the previous card ...
 
 This sounds like a nasty kernel bug ...
 Anyone else seen this ?
 
 Cheers
 
 
 
 Terry

Looks like kernel 3.2.9-2.fc16.i686.PAE is ok ...

Cheers


Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Graphics test week: Where is gfx_test_week_20100414_i686.iso

2010-04-16 Thread Terry Barnaby
I was going to try and participate in the Graphics test week
for ATI/Intel boards. However I can't download either:

http://jlaska.fedorapeople.org/2010-04-13-nouveau-testday/gfx_test_week_20100414_i686.iso

or

http://adamwill.fedorapeople.org/gfx_test_week_20100412/gfx_test_week_20100412_x86-64.iso

due to server timeouts.
Is there anywhere else I can get these from ?

Cheers


Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Graphics test week: Where is gfx_test_week_20100414_i686.iso

2010-04-16 Thread Terry Barnaby
On 16/04/10 16:43, Adam Williamson wrote:
 On Fri, 2010-04-16 at 09:05 +0100, Terry Barnaby wrote:
 I was going to try and participate in the Graphics test week
 for ATI/Intel boards. However I can't download either:

 http://jlaska.fedorapeople.org/2010-04-13-nouveau-testday/gfx_test_week_20100414_i686.iso

 or

 http://adamwill.fedorapeople.org/gfx_test_week_20100412/gfx_test_week_20100412_x86-64.iso

 due to server timeouts.
 Is there anywhere else i can get these from ?

 No...but the links are working fine for me at this time. If they still
 don't work for you, you could do the testing with a nightly build, now
 (all the important packages should be there). Thanks!
Thanks,

The servers seem to be up again now (they were down for at least 7 hours ...)

Cheers


Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Update testing policy: how to use Bodhi

2010-03-27 Thread Terry Barnaby
On 27/03/10 04:12, Adam Williamson wrote:
 On Sat, 2010-03-27 at 03:50 +0100, Kevin Kofler wrote:

 So I don't think blocking an update outright for having received type 2
 feedback is sane at all.

 Sigh. That was why I said not to sidetrack the discussion because it was
 the least important bit of the post. It was just an example of how
 easily the policy can be adapted. I'm really not interested in thrashing
 out the tiny details of *that* in *this* thread, that is not what it's
 for. I had a whole paragraph about the possibility of an override
 mechanism for maintainers which I left out precisely in order to avoid
 this kind of discussion, but apparently that wasn't enough...

I'm not sure if your usage policy covers changes to Bodhi, but how about
having the system email the upstream developers (directly and/or via their
mailing lists) when a release is made available for testing/release, and
also when problems are found?

Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Stable Release Updates types proposal

2010-03-26 Thread Terry Barnaby
 
 This isn't really true. Or at least, not a productive perspective for
 improving Fedora. There are certainly a lot of bugs in the open drivers
 shipped with Fedora, and there's a lot of work to do to fix them all. I
 sent my own reply to Terry explaining how we're handling this at
 present, but just waving the 'it's all NVIDIA's fault' stick at the
 problem won't make it go away :)

One little point I noticed recently. I was watching and testing
a package released by Fedora, and was having some email conversations with
some of the upstream developers of that package at the time.

From what I saw, there did not seem to be much communication between
the Fedora packagers and the upstream developers. In particular,
the upstream developers appeared to be unaware that their code had been
packaged for Fedora/RedHat etc. In fact they were looking at creating
a binary release and working out how to test it on various platforms.

I don't know if this is an isolated case, but if it is common, maybe:

1. The upstream developers' email addresses or mailing list could be
   added to an email list on the package in the Fedora build system.
2. When a package is submitted for testing, an email (and possibly
   other communications) is sent from the packager to the upstream
   developers, letting them know that the package has been built for
   testing.
3. All RPM fixes required are emailed back to the upstream developers.

It seems to me that it would be good if the upstream developers knew
of the package and were encouraged to test it. They are most likely
to find any initial obvious issues.

Cheers


Terry

-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Stable Release Updates types proposal

2010-03-19 Thread Terry Barnaby
On 03/18/2010 09:00 PM, Adam Williamson wrote:
 .
 That's somewhat optimistic; no matter how much testing we do, we can
 only afford a certain amount of full-time developer muscle. The testing
 has helped to improve efficiency and direction of graphics development
 work, I think, but there's fundamentally a lot of work to do and only a
 limited amount of manpower to do it with.
 
 The X devs are always very happy to take new volunteers. :)

Although I have done a fair amount of X development work in the past,
unfortunately, as with most people, my volunteer time is in short
supply (I have a wife and kids to support :) ). So in-use testing,
bug reporting and first-pass bug triage are currently my
limit in Fedora.

If there were less churn in Fedora to cope with, perhaps I and other
user/developers would have more unpaid volunteer time to devote to
actual development :)
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Stable Release Updates types proposal

2010-03-18 Thread Terry Barnaby
On 03/18/2010 12:58 PM, Rahul Sundaram wrote:
 On 03/18/2010 06:27 PM, Terry Barnaby wrote:

 As part of a Fedora release I would have thought that part of the
 test schedule would be to load and view some test documents with
 openoffice on a set number of platforms (Different graphics chipsets).
 This would likely have picked this up ...
   
 
 With the current list of major new features, the test days don't cover
 anything like this.  If you are interested and want to volunteer, that
 would be very helpful.
 
 Rahul

Although I am willing to help test, and have taken part in Fedora test
days, I personally feel that the current climate in Fedora (frequent
releases, pushing new features fast, and perhaps now pushing updates more
quickly) makes testing difficult and stability hard to achieve, and very
hard work. I'm not sure I want to spend significant amounts of my time
battling that ...

-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Stable Release Updates types proposal

2010-03-18 Thread Terry Barnaby
On 18/03/10 19:18, Adam Williamson wrote:
 On Thu, 2010-03-18 at 12:57 +, Terry Barnaby wrote:

 As part of a Fedora release I would have thought that part of the
 test schedule would be to load and view some test documents with
 openoffice on a set number of platforms (Different graphics chipsets).
 This would likely have picked this up ...

 Ahh. Idealism. :)

 No, we are nowhere near this. We only implemented any kind of desktop
 testing for the F13 cycle. Prior to F13, there was no planned validation
 testing of anything besides the installer.

 More generally, we consider it acceptable to release Fedora with known
 regressions in functionality where this is ultimately required to drive
 development forward. QA and devel groups know very well that some
 AMD/ATI adapters behave worse in F10 onwards than they did in F9 and
 earlier (though they tend to be better in F13 than F12, and better in
 F12 than F11). This is due to extensive architecture changes in the
 driver which were considered necessary to support new and future
 hardware properly, and implement new models like Gallium.

Although I understand Fedora's frontier status, I think the graphics
system changes could probably have been handled better. After the kernel
and core shared libraries, the graphics system is probably the next most
essential core OS subsystem (at least for desktop systems), and it seems
most people's stability issues with Fedora stem from graphics. I do
understand the difficulty with the multitude of different graphics
chipsets out there, but this is where Fedora could shine with its close
links to upstream development.

It would have been good to be very upfront about this: get a group to
define and set up some basic graphics tests, and loudly encourage users to
run them both pre-release and post-release. Add to that a website showing
test status versus graphics board/chipset, good, user-friendly links to
Bugzilla, and perhaps a separate graphics-testing repository to keep quick
graphics updates away from the stable release. If enough upstream
developers, Fedora packagers and testing users were in on this, I think
great inroads into a stable and good graphics system would be made in a
relatively short time.

Some more Idealism :)
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Problems in packaging kicad

2010-03-17 Thread Terry Barnaby
On 03/16/2010 08:19 PM, Tom spot Callaway wrote:
 On 03/16/2010 08:27 AM, Alain Portal wrote:
 Hi,

 I get some problem in packaging kicad.
 Linking fails, and I don't know how to solve.
 There is no problem if I compile without packaging.
 
 Well, I doubt that seriously. This code is a bit of a mess (but still
 not as painful as chromium).
 
 Attached is a new spec file, and two patches that fix the problems which
 were preventing it from building.
 
 ~spot, who feels dirty after spending so much time hacking around cmake
 
Another little change that is needed: the package source file
kicad-ld.conf has the contents /usr/lib64/kicad. It should be
/usr/lib/kicad, otherwise it does not work correctly on 32-bit
i686 platforms.

The kicad.spec file edits this to /usr/lib64/kicad if for an x86_64
platform.
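
One way to avoid carrying an arch-specific source file at all would be to
generate the ld.conf fragment at build time. A minimal sketch, assuming the
file ends up under /etc/ld.so.conf.d (the path and file name here are
illustrative, not necessarily what the current spec does):

    # in %install: %{_libdir} expands to /usr/lib on i686 and to
    # /usr/lib64 on x86_64, so no per-arch editing is needed
    mkdir -p %{buildroot}%{_sysconfdir}/ld.so.conf.d
    echo "%{_libdir}/kicad" > %{buildroot}%{_sysconfdir}/ld.so.conf.d/kicad.conf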

Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Stable Release Updates types proposal

2010-03-14 Thread Terry Barnaby
On 14/03/10 16:29, Kevin Kofler wrote:
 Terry Barnaby wrote:
 Last night I was helping some school kids with a powerpoint presentation
 they had written. They had, not unreasonably, used FontWork titles.
 Under F12 with OpenOffice 3.1 it took 3 minutes to load the 10 slide file
 and about 1 minute between slides. Editing was basically impossible.
 I had to go back to a F8 system OpenOffice 2.x system to edit the FontWork
 out. How on earth did this release get out (Fedora/OpenOffice ??) ??
 This is just one of many basic issues I and others have had with Fedora
 recently.

 OO.o 3.1 wasn't even an update, F12 shipped with it, so this is completely
 off topic here.

  Kevin Kofler

Yes, I guess this paragraph of my email is a bit off topic, but I think it
is still related. It is saying that even Fedora releases appear to have too
little testing prior to release. The move to less-tested updates as well is
going even further in this direction in my eyes ...
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Stable Release Updates types proposal (was Re: Fedora Board Meeting Recap 2010-03-11)

2010-03-12 Thread Terry Barnaby
On 12/03/10 03:42, Kevin Kofler wrote:
 Chris Adams wrote:
 There's a difference between not supporting third-party software (is
 that actually documented somewhere or another Kevin Kofler rule?) and
 intentionally breaking it.

 There's no policy saying we support it, ergo by default, we don't.

 And we don't intentionally break it, we upgrade a library for some good
 reason (there's always a good reason why a soname bump gets pushed) and that
 happens to break some third-party software we don't and can't know about.
 (When we do, e.g. for software in RPM Fusion, we alert the affected
 maintainers so they can rebuild their packages.)

 For example, Firefox security updates are impossible to do without ABI
 breaks in xulrunner.

  Kevin Kofler

I really strongly disagree that the ABIs of the commonly used shared
libraries should be allowed to change in a stable release.
We develop internal applications that are packaged and go out to a few
users. We use Fedora primarily as an OS to run the applications we need,
rather than as an experimentation platform.
I consider it unacceptable for a system update to break the
ABI for these, or any other third-party packages. It would mean failures
in the field that would require live intervention. This is what
rawhide is for.
We would end up turning off Fedora updates on these systems and in
effect managing the updates ourselves, probably from our own
repository (our own Fedora spin), or we would move to a different system.
I am sure a lot of users, like us, use Fedora for their own purposes and
develop their own applications for it, but do not maintain them in the
main Fedora package tree. There's more to Fedora than just the main Fedora
repository...

Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Stable Release Updates types proposal

2010-03-12 Thread Terry Barnaby
On 13/03/10 01:45, Kevin Kofler wrote:
 Al Dunsmuir wrote:
 And  turning  all  releases  into  rolling  branches helps keep things
 sane?

 Please call a spade a spade. Without a reduction in churn in the stable
 releases that is what they become and remain until EOL - a rolling branch.

 No, a semi-rolling branch, as in a branch which picks up rolling feature
 upgrades WITHOUT the disruption and breakage inherent to a rolling branch,
 which instead only happens from one release to another. In other words, the
 best of both worlds (release-based and rolling).

  Kevin Kofler

It is generally new features that break things. The core part of a stable 
release is that it has a known set of features that users and developers can
base their work on. This allows users to use the system and progress with
their own work. Some people still predominantly use Fedora rather than work
on it.

The Fedora stable system has, I believe, become more and more unstable
due to the move to more frontier new features and quicker, almost untested,
updates. For an individual user, most new features are of no benefit at
all; sometimes they are a disadvantage.

Last night I was helping some school kids with a powerpoint presentation
they had written. They had, not unreasonably, used FontWork titles.
Under F12 with OpenOffice 3.1 it took 3 minutes to load the 10-slide file
and about 1 minute between slides. Editing was basically impossible.
I had to go back to an F8 system with OpenOffice 2.x to edit the FontWork
out. How on earth did this release get out (Fedora/OpenOffice ??) ??
This is just one of many basic issues I and others have had with Fedora
recently.

Fedora has to have some actual users otherwise there is no real testing
done and no real developer/user communication.

I do think Rawhide is the place for real frontier work, but that work
seems to be moving more and more into stable. Maybe that is because more
front-end developers are working with Fedora, and fewer users?

Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: PROPOSAL: Fedora user survey

2010-03-10 Thread Terry Barnaby
On 03/09/2010 06:20 PM, Ewan Mac Mahon wrote:
 On Tue, Mar 09, 2010 at 05:55:33PM +, Terry Barnaby wrote:
 On 09/03/10 16:50, Ewan Mac Mahon wrote:
 On Tue, Mar 09, 2010 at 09:33:45AM -0500, Al Dunsmuir wrote:

 I  have  limited  time to do system installs and maintenance. Sticking
 with  one  distribution  helps keep that sane. I have a dual boot XP +
 Ubuntu machine that I do some play with, but I find it strange, having
 used  Fedora  since  FC3.

 You should consider running a RHEL rebuild like Scientific Linux or
 CentOS then; they're very Fedora-like in most respects, and are
 supported for very, very long periods.

 The trouble is that Fedora does not want to lose this user or group of
 users.

 I don't think it's about the users so much as the uses; I run Fedora on
 machines it's suitable for, and SL where that's a better fit. I am not
 lost to Fedora. Equally Al Dunsmuir clearly also has multiple systems
 (hence wanting to keep one distribution); so there's no reason to think
 they couldn't do the same.
 
 If all of the users that use Fedora for reasonably important
 tasks (not mission critical) stop using it, it will lose all of the
 real use testing and feedback to upstream developers that Fedora is
 good for and Linux needs.

 I use Fedora for important things, both by home and work desktops run
 it, I just don't use it for things that I expect to run for a long time
 without interruption or intervention or alteration.
 
 My feeling is that is likely to be happening ...

 I think Fedora should have a carefully balanced position between the
 mission critical systems and completely play only systems.

 I think that's a false dichotomy; Fedora shouldn't be a bad system
 unsuitable for real work, it should be a good system. But it can still
 be a good fast-moving system. There doesn't seem much purpose in trying
 to turn it into a good stable long running server system, when we have a
 perfectly good one of those already. 
 
 Clearly, if there was a way to have Fedora's 'freshness', but RHEL's
 lack of hassle that would be ideal, but it's not clear that there is.
 
 Ewan
I agree with you, except that I think Fedora's balance has moved a bit
too far in the fast-moving, frontier direction. I also use Fedora
for home servers and workstations at work, and have done so for many years.
I also use RHEL/CentOS for stable mission-critical systems.
However I am finding that Fedora is getting too unstable, too frontier
and too hard to maintain even for the purposes I use it for. My home
customers (the wife and kids :) ) are now starting to complain!
I can see users like myself moving to Ubuntu or other systems. If this
happens Fedora will lose some of its testing community, and probably
its best testing group.

Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: PROPOSAL: Fedora user survey

2010-03-09 Thread Terry Barnaby
On 09/03/10 15:49, Matt Domsch wrote:
 On Tue, Mar 09, 2010 at 03:30:37PM +, Terry Barnaby wrote:
 I personally thought the old 12 month RedHat Linux release cycle was about
 right.

 RHL was also on a 6 month release cycle.

Shows how good my memory is :)
Mind you, even then, I generally only update my systems every year,
so I probably missed every other release then, as I do now.
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: PROPOSAL: Fedora user survey

2010-03-09 Thread Terry Barnaby
On 09/03/10 15:26, Seth Vidal wrote:


 On Tue, 9 Mar 2010, Konstantin Ryabitsev wrote:

 On Tue, Mar 9, 2010 at 8:51 AM, Seth Vidalskvi...@fedoraproject.org  wrote:
 Here's the camps I see:

 1. One group wants us to aim for mom/pop/grandma/desktop users - the
 apple market or what ubuntu aims for.

 2. one group wants us to aim exclusively for the bleeding edge open
 source developer market.

 3. one group wants us to aim for the admin/experienced user who wants
 newer things but doesn't have time nor interest to fight with lots of 
 broken things.

 4. one group who don't really care about distro wars but use Fedora
 because this way they know what will be in RHEL/CentOS, which is what
 they use for serious work on their servers.

 I think it's a fairly large but mostly quiet group, actually. :)


 Agreed.

 -sv

I guess I fall into group 4. However, Fedora is getting so buggy for me,
and now requires so much maintenance on a day-to-day basis, that I for one
am getting more vocal :)

For the first time in decades I am having to reboot systems due to kernel
panics, graphics system lockups, etc.
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: PROPOSAL: Fedora user survey

2010-03-09 Thread Terry Barnaby
On 09/03/10 16:50, Ewan Mac Mahon wrote:
 On Tue, Mar 09, 2010 at 09:33:45AM -0500, Al Dunsmuir wrote:
 Hello Seth,

 Tuesday, March 9, 2010, 9:23:00 AM, you wrote:

 Your primary server runs fedora? May I ask why?
 -sv

 I  have  limited  time to do system installs and maintenance. Sticking
 with  one  distribution  helps keep that sane. I have a dual boot XP +
 Ubuntu machine that I do some play with, but I find it strange, having
 used  Fedora  since  FC3.

 You should consider running a RHEL rebuild like Scientific Linux or
 CentOS then; they're very Fedora-like in most respects, and are
 supported for very, very long periods.

 Ewan
The trouble is that Fedora does not want to lose this user or group of
users. If all of the users that use Fedora for reasonably important
tasks (not mission-critical ones) stop using it, it will lose all of the
real-use testing and feedback to upstream developers that Fedora is
good for and Linux needs.

My feeling is that this is likely to be happening ...

I think Fedora should have a carefully balanced position between
mission-critical systems and completely play-only systems.

Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: Proposed udpates policy change

2010-03-08 Thread Terry Barnaby
On 08/03/10 23:12, Matthew Garrett wrote:
 On Mon, Mar 08, 2010 at 11:21:45PM +0100, Sven Lankes wrote:

 If Fesco is aiming at getting rid of all the pesky packagers maintaining low
 profile packages: You're well on your way.

 So, no, that's not the intent and it's realised that this is a problem.
 We need to work on making it easier for users to see that there are
 available testing updates and give feedback on them. This is clearly
 going to take a while, and there'd undoubtedly going to be some
 difficulty in getting updates for more niche packages through as a
 result. If people have further suggestions for how we can increase the
 testing base then that would be awesome, but the status quo really
 doesn't seem sustainable.


Can I make a tangential suggestion here ?
I am on the side that quite a few packages need much more testing before
they appear in stable. Fedora, for me, has got less and less stable
and thus less usable. I have been bitten for many years, especially by
the Graphics issues :(

However, getting testing done requires a good many users to actually
test the packages, and that requires it to be easy for the user to do.
This is especially true for the Graphics system where testing is needed
on lots of different hardware.

I wonder if a change to the Fedora package build/release systems could
help here by more closely coupling Bodhi/Others with updates-testing.
Some ideas:

1. All packages in Bodhi automatically appear in updates-testing.
2. The karma (or stability :) ) value is stored in the RPM.
3. A link to the Bodhi information is also stored in the RPM.
4. yum/rpm is modified to support the idea of a karma/package stability
   level.
5. Users can change the package stability level they are willing to
   accept, overall or on a per-package or package-group basis.
6. Simple GUI package manager extensions could handle the extra info, or
   a simple-to-use package-testing GUI could be developed to handle this,
   allowing users to feed back issues in an easy way.
7. There would be an easy way for users to back out the duff packages
   (emails back to the user when an updated package is available?).
8. A link with upstream would be provided so any user-generated feedback
   would be sent to the upstream developers or be easily available to
   them. Bugzilla would be integrated into this.
9. Bodhi information would include info on particular items to test.

This would mean that all packages are immediately available to end users
in an easy-to-use way. Users could decide how frontier they would like
their system. Links to the Bodhi information would be directly available
to users, allowing them to feed back issues and back out packages. It
would also be nice to have package groups, such as Graphics (kernel,
libdrm, mesa, xorg-x11-* etc), so that an entire group of related
packages could be karma'ed, installed and reverted in an easy way. In the
background the system could check for package dependency issues and
notify the package maintainers automatically. Obviously how the
karma/package stability level is calculated is an issue. In fact
updates-testing could probably go and we would just have updates ...
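
Some of this can be approximated today with plain yum configuration, which
may help illustrate the idea. A sketch, assuming a stock updates-testing
repo definition (the package list is illustrative):

    # /etc/yum.repos.d/fedora-updates-testing.repo
    [updates-testing]
    enabled=1
    # only take testing updates for the graphics stack
    includepkgs=kernel* libdrm* mesa* xorg-x11-*

What that cannot give you is the karma level, the feedback path back to
Bodhi, or the easy back-out, which is the point of the proposal.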

Any views? Anyone got some financial resources? :)

Terry

-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


F12: lirc-0.8.6-4.fc12 missing from updates testing ?

2010-02-27 Thread Terry Barnaby
Hi,

In Bugzilla bug 564095 there is mention of an lirc update,
lirc-0.8.6-4.fc12, which fixes an issue with lirc on 2.6.32 kernels and
which has been submitted to updates-testing.
This package does not seem to exist there, and the link to its Bodhi entry
on the Bugzilla page points to an entry for bind ...

Cheers


Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: cpp and backslashes

2010-01-12 Thread Terry Barnaby
On 01/12/2010 09:16 AM, Jakub Jelinek wrote:
 Yes, the tokens are separated by whitespace, so it is sufficient if they are
 again separated by whitespace after preprocessing.  See
 http://gcc.gnu.org/PR41445 for details why this changed, in short
 without the change the tokens have incorrect location and cause unintended
 differences in debug info between direct compilation and compilation where
 preprocessing happens separately from compilation.
 You can use cpp -P to avoid this (then the output won't be cluttered with
 # line filename
 either).
 
Thanks for the info.
I will feed this back to the OpenFOAM developers.
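
For reference, the linemarker difference Jakub mentions is easy to see
(a sketch; output abbreviated):

    $ echo 'int x;' | cpp
    # 1 "<stdin>"
    ...
    int x;

    $ echo 'int x;' | cpp -P
    int x;

With -P the output contains only the preprocessed tokens, which is usually
what you want when feeding cpp non-C input.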

Cheers


Terry
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel