Bug#636075: boinc-client: Boinc is (too) disk intensive

2011-09-14 Thread Steffen Möller
On 09/13/2011 05:08 PM, Diggory Hardy wrote:

> Short version of the story is that increasing the checkpoint interval (even
> quite drastically since checkpoints are only useful should the application
> crash in a non-deterministic fashion or be killed unexpectedly) should help,
> but I don't know what else can be done.

Checkpointing has been ruled out - at least the user configurable part of it.
My machine is on somewhat close to 24/7 and as such does not need any 
checkpointing.
I am hence checkpointing every 100 seconds or so.

Steffen



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#636075: boinc-client: Boinc is (too) disk intensive

2011-09-13 Thread Diggory Hardy
Just to chip in, I've had a few complaints about this as one of the guys
behind malariacontrol.net.

The basic formula for the volume of data being saved to disk per hour is
s*f*n, where s is the size of a checkpoint, f is the frequency (number written
per hour), and n is the number of jobs active in parallel. Obviously keeping
s small is the job of the project (we try, but there is a limit), n is
up to the user, but f is specified by the client/user and AFAICT the
client default cannot be specified by the project (we had one guy complain
that several computers using a 1000 Mbit/s NAS were totally choking the
network).
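
A minimal sketch of the s*f*n estimate in Python. The function names and the example figures (a 20 MB checkpoint, 2 parallel jobs, three intervals) are illustrative assumptions, not measurements from any project.

```python
def checkpoints_per_hour(interval_seconds):
    """Convert a checkpoint interval in seconds into a frequency per hour (f)."""
    return 3600 / interval_seconds


def hourly_write_volume_mb(checkpoint_size_mb, per_hour, parallel_jobs):
    """Estimate the data written per hour as s * f * n, in MB."""
    return checkpoint_size_mb * per_hour * parallel_jobs


# Hypothetical figures: 20 MB per checkpoint, 2 jobs running in parallel.
for interval in (60, 600, 3600):
    f = checkpoints_per_hour(interval)
    print(interval, "s interval:", hourly_write_volume_mb(20, f, 2), "MB/hour")
```

Since the client preference is worded "at most every N seconds", the f derived from it is an upper bound on the checkpoint frequency, so the result is a worst-case estimate.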

Short version of the story is that increasing the checkpoint interval (even
quite drastically since checkpoints are only useful should the application
crash in a non-deterministic fashion or be killed unexpectedly) should help,
but I don't know what else can be done.

-Diggory


Bug#636075: boinc-client: Boinc is (too) disk intensive

2011-08-01 Thread franckr
Hi Kevin,

You probably posted at the same time as me above.

There are 3 points on which I disagree with you:

1. The Clean Energy Project system requirements do NOT state the high disk transfer.
They just mention the large disk space needed (2 GB),
which has nothing to do with the up to 2.9 GB/hr I found.

Ref: When selecting projects in "My Projects" 
(https://secure.worldcommunitygrid.org/ms/viewMyProjects.do)
there is a link near Clean Energy project stating "Please review the system 
requirements before opting to participate in this project"
I stored this link in system_requirements.txt
I don't see the disk transfer in it.


2. Boinc should control (1) your project and (2) where the data are saved
See my <3> in previous message.


3. I never had 3 workunits running in parallel.
I don't know why the above shows that (3 "in progress" = another bug?, or
because I aborted units?), but I only have one dual core, so at most 2 units run
in parallel.
But you are right, I don't remember having seen 2 workunits from the Clean Energy
project running at the same time.

I do agree that further discussions specific to WCG should happen on WCG Forum.

Thanks, 
Franck
System Requirements

What are the recommended minimum system specifications?
How much data will I download and upload while participating with World 
Community Grid?
How do I know if my computer is running the 64-bit research application?


What are the recommended minimum system specifications?
In order to participate in World Community Grid, you will need to have at least 
the following:

The ability to display graphics (if you wish to see the graphics)
An Internet connection

In addition, each research project has its own requirement for memory and disk 
space. These are as follows:

Research Project                               Memory Available  Disk Space  Operating Systems
Computing for Clean Water                      400 MB            100 MB      Windows [1,5], Mac, Linux [3,5]
The Clean Energy Project - Phase 2             1,024 MB          2,048 MB    Windows [1], Mac, Linux [3,4]
Discovering Dengue Drugs - Together - Phase 2  1,024 MB          250 MB      Windows [1], Mac, Linux [3]
Help Cure Muscular Dystrophy - Phase 2         64 MB             50 MB       Windows [1], Mac [2], Linux [3]
Help Fight Childhood Cancer                    250 MB            100 MB      Windows [1], Mac, Linux [3]
Help Conquer Cancer                            250 MB            50 MB       Windows [1], Mac, Linux [3]
Human Proteome Folding - Phase 2               250 MB            100 MB      Windows [1], Mac [2], Linux [3]
FightAIDS@Home                                 250 MB            100 MB      Windows [1], Mac, Linux [3]

1. World Community Grid supports the following Windows platforms: Windows XP, 
Vista, 7, Server 2003, Server 2008
2. For these projects, only Mac computers using Intel processors are supported
3. Only Linux on x86 and x86-64 is supported
4. Users who choose to run this project are encouraged to set the 'Leave 
applications in memory while suspended' option in their device profile
5. 64-bit application available


How much data will I download and upload while participating with World 
Community Grid?
The amount of data that you transfer depends upon how your processing 
preferences are set to run, how powerful your computer is and how often your 
computer is on. It also varies based upon which research projects you run on 
your computer. An average computer contributing to World Community Grid returns 
about 2 results per day.

Each of the research projects at World Community Grid uses a different 
application, input files and output files. As a result, the size used for each 
of these varies by project. This is outlined on the chart below. Please note 
that the data is compressed during transfer and is decompressed after it has 
been downloaded. As a result it will occupy more space on disk than the numbers 
shown below.

Research Project                               One-Time Download  Per Workunit Download  Per Workunit Upload
Computing for Clean Water                      10 MB              0.001 MB               0.5 MB
The Clean Energy Project - Phase 2             200 MB             0.1 MB                 20 - 80 MB
Discovering Dengue Drugs - Together - Phase 2  10 MB              1-20 MB                1 MB
Help Cure Muscular Dystrophy - Phase 2         0.7 MB             0.1 MB                 0.2 MB
Help Fight Childhood Cancer                    1.0 MB             0.1 MB                 0.1 MB
Help Conquer Cancer                            1.0 MB             0.1 MB                 0.1 MB
Human Proteome Folding - Phase 2               12.0 MB            1.5 MB                 0.1 MB
FightAIDS@Home                                 1.5 MB             0.1 MB                 0.1 MB



How do I know if my computer is running the 64-bit research application?
On a Windows machine, you can use the Windows task manager to view the process 
name. 64-bit research applications will end with "windows_intelx86_64", while 
32-bit applications will end with "windows_intelx86"



On a Linux machine, you can find the PID of the research application (which
will start with the name "wcg") and then execute the command "file -L
/proc/<PID>/exe"
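
As a hedged alternative to the `file -L` check on Linux, the same information can be read from the ELF header of the executable. The sketch below is in Python; the function names are illustrative, and it relies only on the ELF identification bytes (magic `\x7fELF`, then a class byte where 1 means 32-bit and 2 means 64-bit).

```python
def elf_bitness(header):
    """Return 32 or 64 for the first bytes of an ELF file, or None if not ELF."""
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # Byte 4 is EI_CLASS: 1 = 32-bit objects, 2 = 64-bit objects.
    return {1: 32, 2: 64}.get(header[4])


def process_bitness(pid):
    """Read the ELF class of a running process via /proc/<pid>/exe (Linux only)."""
    with open(f"/proc/{pid}/exe", "rb") as exe:
        return elf_bitness(exe.read(5))
```

Reading /proc/<pid>/exe of a process owned by another user (such as boinc) normally needs root privileges.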

There is no 64-bit research application for Mac machines at this time.


Bug#636075: boinc-client: Boinc is (too) disk intensive

2011-08-01 Thread Kevin Reed


BOINC is not causing the heavy IO - it is the research application 'The
Clean Energy Project' from World Community Grid.

That research application uses very heavy disk IO. This is why, when a
user signs up to participate at World Community Grid, 'The Clean Energy
Project' is not selected by default and there is a warning that says
"Please review the system requirements before opting to participate in this
project". There is a link to the system requirements that states that 2GB
of disk space is required.

Additionally, if a user still proceeds to select the project, they will be
limited to only have one workunit for the project at a time downloaded by
the BOINC client.  In the information above, the user has three workunits
downloaded.  This means that they took further action to change the default
behavior through an option that is buried pretty deeply in preferences.

Further discussion of this issue is most appropriately handled on the World
Community Grid forums for this research application at
http://www.worldcommunitygrid.org/forums/wcg/listthreads?forum=480

thanks,
Kevin Reed
World Community Grid Technical Support

Bug#636075: boinc-client: Boinc is (too) disk intensive

2011-08-01 Thread franckr
Hi Steffen,

Thanks for your input.
As you have low disk IO on your system, I reinstalled boinc (this time manually
on an HDD) and checked one application at a time from the project "World Community
Grid".

<1> Conditions:
   IO throughput for 1 task running at a time (if 2 were running at the same time, I
divided the write IO/hr by 2)
   On a dual-core Core2 @ 3.2 GHz (a faster CPU or more cores will give higher write
IO).
   Set "Task checkpoint to disk at most every" to 6 s

<2> Findings:
It is application dependent:

- Nice applications (write small amounts)
  (ca. 200 kB/min) Help Conquer Cancer
  (ca. 800 kB/min) FightAIDS@Home

- Guilty application:
   The Clean Energy Project - Phase 2
   This one writes *** 2 to 2.9 GB/hr *** ! (again with only 1 task running on one core
of an old CPU)
   And nearly no reads => that's a waste of participants' resources and
computers.

   The 'System Requirements' announced by 'World Community Grid' say
  Memory Available 1,024 MB
  Disk Space 2,048 MB
   But nowhere is it indicated that it is so write intensive!

   There might be other guilty applications, but I cannot check them all.
Furthermore it may be task dependent too.


<3> Conclusion
I think Boinc should control applications, and allow the user to set 'max
writing', as it today allows 'max cpu' and 'max memory'.

Thus suggested potential solutions:
a- Applications announce their estimated write I/O (read I/O too?)
b- Boinc allows the user to limit max I/O per task (and if exceeded, stops the task & uses
other applications). When the user sets a limit, Boinc can compare it to the application's
announcement to avoid starting with limit < estimation.
c- Install by default in /home (e.g. /home/boinc), not in /var/lib/boinc-client.
This will increase the chances of landing on an HDD and not on an SSD.


<4> Details of 'Clean Energy Project - Phase 2' IO
Conditions:
   - $ time iotop -u boinc -a
  (you might need to install it with 'aptitude install iotop')
   - check that the process IDs are the same before and after the measurements, because
some processes disappear and new ones appear (during the same task!); iotop
only shows currently running processes, not ones that have exited => one can
miss some IO.
   - with Project 'World Community Grid'
  Application 'The Clean Energy Project - Phase 2 6.40'
  Starting task E202831_920_C.27.C21H11NOS3Si.00590497.3.set1d06_1 using
cep2 version 640
   - Set "Task checkpoint to disk at most every" to 6 s
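
On the caveat that iotop drops the totals of processes that have exited: one possible workaround is to poll the per-process counters in /proc/<pid>/io and remember the last value seen for each PID. The Python sketch below only shows the bookkeeping; the function names are hypothetical, and it assumes the `write_bytes` line that Linux exposes in /proc/<pid>/io.

```python
def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def record_sample(last_seen, pid, io_text):
    """Remember the latest cumulative write_bytes for a PID."""
    last_seen[pid] = parse_proc_io(io_text)["write_bytes"]


def total_written(last_seen):
    """Sum the last known write_bytes of all PIDs, including exited ones."""
    return sum(last_seen.values())
```

A polling loop would call `record_sample` for every boinc-owned PID at short intervals. Writes made between the last poll and a process exit are still missed, PIDs can be reused, and reading another user's /proc/<pid>/io normally requires root.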

Result1:
  TID  PRIO  USER   DISK READ  DISK WRITE  SWAPIN     IO>  COMMAND
 5328  idle  boinc     0.00 B   1180.60 M  0.00 %  0.01 %  ../../projects/www.worldcommunitygrid.o~86.svp.n.opt 2378613406239751723 41381 0
 5151  idle  boinc     0.00 B    140.00 K  0.00 %  0.00 %  ../../projects/www.worldcommunitygrid.o~chem_prod_linux.x86 -float 1 -stop 43200
 5153  idle  boinc     0.00 B     32.00 K  0.00 %  0.00 %  ../../projects/www.worldcommunitygrid.o~chem_prod_linux.x86 -float 1 -stop 43200
 5330  idle  boinc     0.00 B    112.00 K  0.00 %  0.00 %  ../../projects/www.worldcommunitygrid.o~86.svp.n.opt 2378613406239751723 41381 0
 1951  idle  boinc     0.00 B     84.00 K  0.00 %  0.00 %  boinc --check_all_logins --redirectio --dir /var/lib/boinc-client
 5152  idle  boinc     0.00 B      0.00 B  0.00 %  0.00 %  ../../projects/www.worldcommunitygrid.o~chem_prod_linux.x86 -float 1 -stop 43200
 5329  idle  boinc     0.00 B      0.00 B  0.00 %  0.00 %  ../../projects/www.worldcommunitygrid.o~86.svp.n.opt 2378613406239751723 41381 0

 real   26m45.474s
 user   0m10.609s
 sys    0m2.836s

1180.6/(26*60+45)*3600/1024 = 2.59 GB/hr

Result2:
  TID  PRIO  USER   DISK READ  DISK WRITE  SWAPIN     IO>  COMMAND
 5328  idle  boinc     0.00 B    829.65 M  0.00 %  0.01 %  ../../projects/www.worldcommunitygrid.o~86.svp.n.opt 2378613406239751723 41381 0
 5151  idle  boinc     0.00 B     84.00 K  0.00 %  0.00 %  ../../projects/www.worldcommunitygrid.o~chem_prod_linux.x86 -float 1 -stop 43200
 5153  idle  boinc     0.00 B     28.00 K  0.00 %  0.00 %  ../../projects/www.worldcommunitygrid.o~chem_prod_linux.x86 -float 1 -stop 43200
 5330  idle  boinc     0.00 B     64.00 K  0.00 %  0.00 %  ../../projects/www.worldcommunitygrid.o~86.svp.n.opt 2378613406239751723 41381 0
 1951  idle  boinc     0.00 B     56.00 K  0.00 %  0.00 %  boinc --check_all_logins --redirectio --dir /var/lib/boinc-client
 5152  idle  boinc     0.00 B      0.00 B  0.00 %  0.00 %  ../../projects/www.worldcommunitygrid.o~chem_prod_linux.x86 -float 1 -stop 43200
 5329  idle  boinc     0.00 B      0.00 B  0.00 %  0.00 %  ../../projects/www.worldcommunitygrid.o~86.svp.n.opt 2378613406239751723 41381 0

 real   16m55.249s
 user   0m6.636s
 sys    0m1.816s

829.65/(16*60+55)*3600/1024 = 2.87 GB/hr
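
The rate conversion used for both results can be written as a small Python helper. This is only a sketch: the function name is illustrative, and it assumes that the "M" figures in the accumulated iotop output are units of 1024 KB, so that dividing by 1024 gives the hourly rate in the next larger unit. The elapsed time is the whole-second part of the `real` value reported by `time`.

```python
def write_rate_per_hour(written_m, minutes, seconds):
    """Convert an accumulated write total (iotop 'M') over an elapsed
    time into a rate per hour, expressed in units of 1024 M."""
    elapsed_seconds = minutes * 60 + seconds
    return written_m / elapsed_seconds * 3600 / 1024


# Totals of the main writer (TID 5328) and elapsed times of the two runs.
print(round(write_rate_per_hour(1180.60, 26, 45), 2))
print(round(write_rate_per_hour(829.65, 16, 55), 2))
```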


ResultA: Just for the fun, I let it run
Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
  TID  PRIO  USER DISK READ  DISK WRITE  

Bug#636075: boinc-client: Boinc is (too) disk intensive

2011-07-31 Thread franckr
Hello Steffen,

First thank you very much for your quick comments !

I am sorry that I couldn't convince you.
> The BOINC client itself does about nothing. How much IO is required just 
> depends on the scientific application.
I also thought that the IO amount may depend on the project, and not on boinc
directly. But boinc manages the projects and I can only file bugs against boinc,
so I hope it is acceptable.

Thanks for your 'du' data, but I wish to underline that the point is the amount of
disk IO, not the footprint on the disk. So 'du' will not help you to see it
(because you can overwrite the same data many times, and du will not show you
that). (FYI, from memory, boinc used something like a 100 MB footprint. It is now
deleted (maybe I should have kept it but...))

How is your disk IO amount?
$ iostat
(to get it you may need to run # aptitude install sysstat)

> I could suggest to upstream to have the individual projects announce their 
> estimate for max disk space and disk I/O per task. 
- Yes, announcing the disk I/O write amount would be a good idea.
- What do you think about this (if it is easy to do)?
  "install by default in /home/boinc (create a specific user). This will
increase the chances to land on a HDD and not on a SSD."
  Because I assume people normally don't put their home on an SSD (and people
with a laptop maybe won't run boinc).

> This would be a wishlist item, then.
I put 'important' because it has a major effect on an SSD, which for me makes
boinc unusable.
   "4 important = a bug which has a major effect on the usability of a package,
without rendering it completely unusable to everyone."
My point was to warn other users about the heavy writing involved, so that they avoid
wearing out their SSD too quickly.
I don't think this belongs in 'wishlist', because nobody would see it.
I hope you can share my opinion.

If you are not yet convinced by your own iostat (then you are a lucky guy... or
I must be very unlucky), here are:
1. from my SSD's SMART data
198 Total Read Sectors  1054749553
199 Total Write Sectors 2404159500
200 Total Read Command  18727160
201 Total Write Command 18137047
208 avg Erase   485
209 remaining life  91
As you see,  Total Write >> Total Read => To come to such Totals, I assume it 
is not linked to only one project, but to many projects.
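
A rough Python sketch of what these two counters imply. Attributes 198 and 199 are vendor-specific, so the 512-byte sector unit used here is an assumption that the SMART output itself does not confirm; the write/read ratio does not depend on it.

```python
# Raw values copied from the SMART output above.
TOTAL_READ_SECTORS = 1054749553
TOTAL_WRITE_SECTORS = 2404159500

# ASSUMPTION: one counted sector is 512 bytes (not confirmed by the drive).
ASSUMED_SECTOR_BYTES = 512


def sectors_to_bytes(sectors, sector_bytes=ASSUMED_SECTOR_BYTES):
    """Convert a sector count into bytes under the assumed sector size."""
    return sectors * sector_bytes


write_read_ratio = TOTAL_WRITE_SECTORS / TOTAL_READ_SECTORS
print("write/read ratio:", round(write_read_ratio, 2))
print("bytes written (if 512-byte sectors):", sectors_to_bytes(TOTAL_WRITE_SECTORS))
```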

2. Impact of starting and stopping boinc on iostat
See start-stop-boinc-impact.txt
It is not a very long test, but it gives some idea.

The command below also showed high write activity when boinc was running. I have
no records, but you should just let it run for a while and look at it. From time to
time, bursts of several tens of MB are written.
$ iostat -p /dev/sda 2
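
For anyone who wants to log these bursts instead of watching them, here is a minimal Python sketch that samples the kernel counters in /proc/diskstats directly (Linux only). The function names and defaults are illustrative assumptions; the field position of the sectors-written counter and the 512-byte sector unit follow the kernel's documented diskstats layout.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors of 512 bytes


def sectors_written(diskstats_text, device):
    """Return the cumulative sectors-written counter for one device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major minor name reads merged sectors_read ms_read
        #         writes merged sectors_written ...
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    raise ValueError(f"device {device!r} not found")


def sample_write_bytes(device="sda", interval_s=2.0, path="/proc/diskstats"):
    """Measure the bytes written to a device during one interval."""
    with open(path) as stats:
        before = sectors_written(stats.read(), device)
    time.sleep(interval_s)
    with open(path) as stats:
        after = sectors_written(stats.read(), device)
    return (after - before) * SECTOR_BYTES
```

Calling `sample_write_bytes("sda", 2.0)` in a loop, once with boinc running and once with it stopped, would give figures comparable to the iostat run above.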

> You had not mentioned the project you were joining.
Sorry, I didn't record which projects were running (2 at the same time, dual core)
when I tested.

Looking at http://www.worldcommunitygrid.org selected projects are: 
The Clean Energy Project - Phase 2 
Help Conquer Cancer 
Human Proteome Folding - Phase 2 
FightAIDS@Home 
Discovering Dengue Drugs - Together - Phase 2 
The Clean Energy Project 
Discovering Dengue Drugs - Together  

These are the last tasks I ran
X119570355201005271402_ 0                              --  In Progress   7/30/11 19:27:51  8/6/11 19:27:51   0.00  0.0 / 0.0
X119570353201005271402_ 0                              --  In Progress   7/30/11 19:27:51  8/6/11 19:27:51   0.00  0.0 / 0.0
X119570311201005271403_ 0                              --  In Progress   7/30/11 19:27:27  8/6/11 19:27:27   0.00  0.0 / 0.0
E202814_ 445_ C.28.C18H6N6OS2Se.00479563.4.set1d06_ 0  --  In Progress   7/30/11 19:27:27  8/9/11 19:27:27   0.00  0.0 / 0.0
or465_ 6_ 10                                           --  User Aborted  7/30/11 19:24:04  7/30/11 19:27:27  0.00  0.0 / 0.0
X119571294201005131342_ 1                              --  User Aborted  7/30/11 19:24:03  7/30/11 19:24:37  0.00  0.0 / 0.0
X119571297201005131342_ 1                              --  User Aborted  7/30/11 19:23:45  7/30/11 19:27:27  0.01  0.4 / 0.0
E202814_ 977_ C.27.C19H8N4S3Se.00538161.3.set1d06_ 0   --  User Aborted  7/30/11 19:23:42  7/30/11 19:27:27  0.01  0.3 / 0.0
faah23578_ ZINC00631422_ x1HHPxtl_ 03_ 1               --  User Aborted  7/30/11 13:14:36  7/30/11 16:39:55  0.00  0.1 / 0.0
E202811_ 452_ C.27.C24H14N2O.00075148.3.set1d06_ 0     --  User Aborted  7/30/11 11:23:29  7/30/11 16:39:55  0.11  3.5 / 0.0
X119500648201005260958_ 1                              --  Valid         7/30/11 10:57:43  7/30/11 16:39:55  1.04  31.4 / 24.2
X119500657201005260958_ 1                              --  Valid         7/30/11 10:57:42  7/30/11 13:14:35  1.21  36.3 / 27.1
faah23575_ ZINC00871887_ x1HHPxtl_ 01_ 0               --  User Aborted  7/30/11 09:38:52  7/30/11 16:39:55  3.24

Bug#636075: boinc-client: Boinc is (too) disk intensive

2011-07-31 Thread Steffen Möller

Hello,

On 07/30/2011 10:04 PM, franck wrote:

> Package: boinc-client
> Version: 6.12.33+dfsg-1
> Severity: important
>
> Boinc is (too) disk intensive.


Many thanks for your comment. You have not convinced me, though.
The BOINC client itself does next to nothing. How much IO is required
just depends on the scientific application. The BOINC manager
reports the disk space every scientific application needs,
and you can control that amount.

Here is my data:
$ sudo du -sh /var/lib/boinc-client/
9.4M    /var/lib/boinc-client/

I could suggest to upstream to have the individual projects announce
their estimate for max disk space and disk I/O per task. I presume
you don't mind reads so much but worry about too many writes. The
user would then need to explicitly OK those values when a project
is joined. Is that what you are asking for? This would be a wishlist
item, then.

You had not mentioned the project you were joining. I could try
explaining what you observed, or might need to forward you to the
respective project's forum.

Thanks and regards,

Steffen






Bug#636075: boinc-client: Boinc is (too) disk intensive

2011-07-30 Thread franck
Package: boinc-client
Version: 6.12.33+dfsg-1
Severity: important

Boinc is (too) disk intensive.

Hello Boinc users and maintainers,

I was shocked to see that boinc is writing a lot to disk.
Several gigabytes in a few hours seems usual!

This is simply too much.
As the default installation path is /var/lib/boinc-client, this happens to be on
/dev/sda in my case (and probably for most). This will just kill my SSD in the
end.
But even after manually relocating it to an HDD, I think this is too much disk
usage.

So I would suggest:

1. Short-term countermeasures
- boinc users: check your disk usage with 'iostat', especially if you have an SSD.
Move boinc somewhere else if your SSD suffers.
- boinc maintainer:
-- better to put a warning somewhere before the user activates boinc: "Warning:
significant disk writing can occur (several GB/hr) => avoid installing on
SSD"
-- install by default in /home/boinc (create a specific user). This will
increase the chances to land on a HDD and not on a SSD.

2. Best
Investigate why the hell boinc needs to write so much.
As the amount written >> the amount read, I guess it is just to back up intermediate
results that are not needed for further calculations.


Notes:
- 'Task checkpoint to disk at most every' was initially set to 60 s. I tried 1000 s
but saw no change. In both cases several MB are written in less than 60 s.
- I am not sure whether this depends on the tasks or not.
- for my part, I will uninstall boinc :-(

I do hope the above will prevent others from being surprised by premature SSD
failure (unfortunately SSDs are silent & quick, so I didn't notice anything
before playing with iostat).

Thanks for the long reading.
Franck


-- Package-specific info:
-- Contents of /etc/default/boinc-client:
# This file is /etc/default/boinc-client, it is a configuration file for the
# /etc/init.d/boinc-client init script.

# Set this to 1 to enable and to 0 to disable the init script.
ENABLED="1"

# Set this to 1 to enable advanced scheduling of the BOINC core client and
# all its sub-processes (reduces the impact of BOINC on the system's
# performance).
SCHEDULE="1"

# The BOINC core client will be started with the permissions of this user.
BOINC_USER="boinc"

# This is the data directory of the BOINC core client.
BOINC_DIR="/var/lib/boinc-client"

# This is the location of the BOINC core client, that the init script uses.
# If you do not want to use the client program provided by the boinc-client
# package, you can specify here an alternative client program.
#BOINC_CLIENT="/usr/local/bin/boinc"
BOINC_CLIENT="/usr/bin/boinc"

# Here you can specify additional options to pass to the BOINC core client.
# Type 'boinc --help' or 'man boinc' for a full summary of allowed options.
#BOINC_OPTS="--allow_remote_gui_rpc"
BOINC_OPTS=""

-- System Information:
Debian Release: wheezy/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.0.0-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages boinc-client depends on:
ii  adduser                3.113             add and remove users and groups
ii  ca-certificates        20110502          Common CA certificates
ii  debconf [debconf-2.0]  1.5.41            Debian configuration management sy
ii  libc6                  2.13-11           Embedded GNU C Library: Shared lib
ii  libcurl3               7.21.6-3          Multi-protocol file transfer libra
ii  libgcc1                1:4.6.1-5         GCC support library
ii  libssl1.0.0            1.0.0d-3          SSL shared libraries
ii  libstdc++6             4.6.1-5           GNU Standard C++ Library v3
ii  python                 2.6.7-2           interactive high-level object-orie
ii  zlib1g                 1:1.2.3.4.dfsg-3  compression library - runtime

Versions of packages boinc-client recommends:
ii  ia32-libs              20110609          ia32 shared libraries for use on a

Versions of packages boinc-client suggests:
pn  boinc-app-seti         <none>            (no description available)
ii  boinc-manager          6.12.33+dfsg-1    GUI to control and monitor the BOI
ii  x11-xserver-utils      7.6+3             X server utilities

-- Configuration Files:
/etc/boinc-client/global_prefs_override.xml changed:

   0
   1
   0
   3.00
   55.00
   0.00
   0.00
   0.00
   0.00
   0
   0
   0
   0
   0.05
   0.40
   100.00
   120.00
   1000.00
   5.50
   50.00
   2.00
   50.00
   25.00
   50.00
   0.00
   0.00
   100.00
   0.00
   0



