[Fwd: Re: Berkeley DB]

2005-07-10 Thread Bruce Dubbs


 Original Message 
Subject: Re: Berkeley DB
Date: Sat, 09 Jul 2005 19:35:24 -0500
From: David Jensen [EMAIL PROTECTED]
To: Bruce Dubbs [EMAIL PROTECTED]
References: [EMAIL PROTECTED]

Bruce Dubbs wrote:

David,
  Did you add the test instructions to the Berkeley DB section?  If so,
 it must have been while I was on vacation.

  In any case, I tried to run it last night.  The system was a 2GHz P4
with 1G RAM.  When I left about 6PM it had been running several hours
and supposedly had less than an hour to go.

  When I checked on it about 2PM today (20 hours later), the system was
*very* slow.  The screen did say that all 1562 tests were done.  I
finally was able to get top working and found a load average of 17!
There were four processes using 25% of the CPU each -- all
named something like tcl8.4.  When I killed these processes, I got
control of my system back.  I also had to kill several other remnant
processes.

  Do you have any idea what was going on?

  

   It always *best guesses* 1 hour!  4 processes is correct, because of
*run_parallel 4*.  It may have finished in a few more minutes.  After it
says the tests are done, it scans the output logs and prints the errors
and whether the run failed or succeeded.  That does take some time.
   Now first, let me say the instructions before I got involved said:

run_parallel run_std

Notice, no 4.  This is like make -j with no job limit.  It created
1582 processes and filled a 12G partition with test directories!  So I
limited it to 4.  I have a dual 2.8 with hyperthreading on, thus top
showed varying low to medium loads on the 4 pipes.
Probably we should ditch the run_parallel and just add a note that it
could be used.

I will run a test overnight without the run_parallel.  Then use that SBU.


  Also, I think it might be better to have the book use instructions
something like:

  tclsh

  From the tclsh prompt (%), run:

  source ../test/test.tcl
  run_parallel 4 run_std
  exit

  make realclean
  cd ..

  

Yes that is better.

Also there is a typo in the book.  There is cd.. which should be cd ..
(note the space).

  

I'll fix that.

--
David Jensen


-- 
http://linuxfromscratch.org/mailman/listinfo/blfs-dev
FAQ: http://www.linuxfromscratch.org/blfs/faq.html
Unsubscribe: See the above information page


[Fwd: Re: Berkeley DB]

2005-07-10 Thread Bruce Dubbs


 Original Message 
Subject: Re: Berkeley DB
Date: Sat, 09 Jul 2005 19:57:21 -0500
From: Bruce Dubbs [EMAIL PROTECTED]
To: David Jensen [EMAIL PROTECTED]
References: [EMAIL PROTECTED] [EMAIL PROTECTED]

David Jensen wrote:

It always *best guesses* 1 hour!  4 processes is correct, 
 *run_parallel 4*. It may have finished in a few more minutes.

When I left, I was up to test 1350 or so.  It said less than one hour
and I waited 20 :(

After it
 says the tests are done, it scans the output logs and prints the errors
 and failed or successful.  That does take some time.
Now first, let me say the instructions before I got involved said:
 
 run_parallel run_std.  Notice, no 4.  this is like make -j  It created
 1582 processes and filled a 12G partition with test directories!  So I
 limited it to 4.  

That is reasonable.  I didn't check disk space, so that may be an issue.
 Do you know where the dirs were made?  Current dir?  /tmp ?

I have a dual 2.8 with hyperthreading on, thus top
 showed varying low to medium loads on the 4 pipes.
 Probably we should ditch the run_parallel, just add a note that it could
 be used.

Not if it fills up 12G!  That doesn't seem reasonable to me.

 I will run a test overnight without the run_parallel.  Then use that SBU.

OK.  I'll try it too.

  -- Bruce



[Fwd: Re: Berkeley DB]

2005-07-10 Thread Bruce Dubbs


 Original Message 
Subject: Re: Berkeley DB
Date: Sat, 09 Jul 2005 20:40:42 -0500
From: Bruce Dubbs [EMAIL PROTECTED]
To: David Jensen [EMAIL PROTECTED]
References: [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]
[EMAIL PROTECTED]

David Jensen wrote:
 Bruce Dubbs wrote:

 Probably we should ditch the run_parallel, just add a note that it could
 be used.

 Not if it fills up 12G!  That doesn't seem reasonable to me.

 I mean used with 'number of processors'.

I've lost the thread here.  I'm not sure what you mean.

 I will run a test overnight without the run_parallel.  Then use that
 SBU.

 I started it already.  It's less than 10% loaded.
 I'm guessing 14 hours for me!

OK.  I've started too.  Right now I'm up to test 150.  top shows:

top - 20:35:49 up 4 days, 19:50,  1 user,  load average: 2.89, 3.43, 2.59
Tasks:  95 total,   1 running,  94 sleeping,   0 stopped,   0 zombie
Cpu(s):  7.8% us,  3.0% sy,  0.0% ni, 54.0% id, 34.6% wa,  0.2% hi,  0.5% si
Mem:    513880k total,   391984k used,   121896k free,    39040k buffers
Swap:  2097140k total,      904k used,  2096236k free,    78940k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7508 bdubbs    20   0 20788 8140 1808 S  6.3  1.6   0:02.50 tclsh8.4
 4587 bdubbs    16   0 20088 7144 2020 S  5.3  1.4   0:09.05 tclsh8.4
 4590 bdubbs    17   0 20088 7144 2020 S  5.3  1.4   0:09.35 tclsh8.4
 7874 bdubbs    17   0 20792 8124 1792 S  4.6  1.6   0:02.01 tclsh8.4
 4591 bdubbs    17   0 20088 7144 2020 S  3.0  1.4   0:09.12 tclsh8.4
 4594 bdubbs    17   0 20088 7140 2020 S  2.3  1.4   0:09.02 tclsh8.4

To run, what I did was:

$ cat > testit << EOF
#!/usr/bin/tclsh

source ../test/test.tcl
run_parallel 4 run_std
EOF

$ chmod +x testit
$ time ./testit

I'll let you know what happens.

  -- Bruce



[Fwd: Re: Berkeley DB]

2005-07-10 Thread Bruce Dubbs


 Original Message 
Subject: Re: Berkeley DB
Date: Sat, 09 Jul 2005 21:46:46 -0500
From: David Jensen [EMAIL PROTECTED]
To: Bruce Dubbs [EMAIL PROTECTED]
References: [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]

Bruce Dubbs wrote:

I've lost the thread here.  I'm not sure what you mean.

  

you too?

top - 21:40:26 up 14:51,  3 users,  load average: 1.66, 1.29, 0.66
Tasks:  76 total,   1 running,  75 sleeping,   0 stopped,   0 zombie
Cpu0  :  4.3% us,  3.7% sy,  0.0% ni, 45.8% id, 45.8% wa,  0.3% hi,  0.0% si
Cpu1  :  2.3% us,  1.3% sy,  0.0% ni, 71.3% id, 25.0% wa,  0.0% hi,  0.0% si
Cpu2  :  0.3% us,  0.0% sy,  0.0% ni, 98.7% id,  1.0% wa,  0.0% hi,  0.0% si
Cpu3  :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   1034612k total,   996624k used,    37988k free,   166548k buffers
Swap:   987988k total,      732k used,   987256k free,   658904k cached

I am running just 'run_std'.
It has completely different output: no guesses, no test numbers.
After 2 hours:

% run_std
Test suite run started at: 19:41 07/09/05
Sleepycat Software: Berkeley DB 4.3.28: (April 22, 2005)
Running environment tests
Running archive tests
Running file operations tests
Running locking tests
Running logging tests
Running memory pool tests
Running mutex tests
Running transaction tests
Running deadlock detection tests
Running subdatabase tests
Running byte-order tests
Running recno backing file tests
Running DBM interface tests
Running NDBM interface tests
Running Hsearch interface tests
Running secondary index tests

--
David Jensen






[Fwd: Re: Berkeley DB]

2005-07-10 Thread Bruce Dubbs


 Original Message 
Subject: Re: Berkeley DB
Date: Sun, 10 Jul 2005 11:00:08 -0500
From: Bruce Dubbs [EMAIL PROTECTED]
To: David Jensen [EMAIL PROTECTED]
References: [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]
[EMAIL PROTECTED]

David Jensen wrote:
 Bruce Dubbs wrote:
 
 $ cat > testit << EOF
 #!/usr/bin/tclsh

 source ../test/test.tcl
 run_parallel 4 run_std
 EOF

 $ chmod +x testit
 $ time ./testit

  

 I forgot to time it, however, it printed the start and end times.
 I have 236 SBU.  That seems about right, 80 SBU with run_parallel 4.
 
 Now, I have:
 UNEXPECTED OUTPUT: WARNING: log record type __db_pg_new: not tested
 UNEXPECTED OUTPUT: WARNING: log record type __db_pg_prepare: not tested
 Regression Tests Failed
 Check UNEXPECTED OUTPUT lines.
 
 I did not get these two warnings in four runs of run_parallel 4, or one
 run_parallel 10.  I will send a report to Sleepycat.
 
 Note: parallel 4 and parallel 10 ran about the same, 80 SBU.

I got (using parallel 4):

02:40:58 (00:05:00) processes running: 10286 10287 10289 10292
Starting test 1520 of 1582 parallel items.  Rough guess: less than 1 hour left.
Starting test 1530 of 1582 parallel items.  Rough guess: less than 1 hour left.
Starting test 1540 of 1582 parallel items.  Rough guess: less than 1 hour left.
Starting test 1550 of 1582 parallel items.  Rough guess: less than 1 hour left.
Starting test 1560 of 1582 parallel items.  Rough guess: less than 1 hour left.
Starting test 1570 of 1582 parallel items.  Rough guess: less than 1 hour left.
Starting test 1580 of 1582 parallel items.  Rough guess: less than 1 hour left.
Process 2: 451 commands executed in 02:59
02:45:58 (00:05:00) processes running: 10286 10289 10292
Process 4: 444 commands executed in 03:02
Process 1: 438 commands executed in 03:02
02:50:58 (00:05:00) processes running: 10289
Process 3: 249 commands executed in 03:07
All processes have exited.
Checking output from ALL.OUT.1 ... UNEXPECTED OUTPUT: g.5  ddoyscript.tcl ./TESTDIR.1 2 6 o 5
 done.
Checking output from ALL.OUT.2 ...  done.
Checking output from ALL.OUT.3 ...  done.
Checking output from ALL.OUT.4 ...  done.
Regression tests failed.
Review UNEXPECTED OUTPUT lines above for errors.
Complete logs found in ALL.OUT.x files

real    190m43.728s
user    138m3.860s
sys     18m12.954s


That equates to 87.9 SBU on my system.  The run took about 15 more
minutes to finish after the last test started.
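
For readers who want to redo that conversion: an SBU figure is the elapsed time divided by the time one reference build takes on the same machine. The sketch below is one way to compute it in shell. The 130.2 seconds per SBU is not stated anywhere in this thread; it is an assumption back-derived from the 190m43.728s and 87.9 SBU figures above, and to_sbu is a hypothetical helper name.

```shell
#!/bin/sh
# Convert the "real" figure printed by `time` (e.g. 190m43.728s) into SBU.
# to_sbu is a hypothetical helper; the length of one SBU in seconds has to
# be measured on your own machine.
to_sbu() {
    # $1 = elapsed time in the form NNNmSS.SSSs, $2 = seconds per SBU
    awk -v t="$1" -v sbu="$2" 'BEGIN {
        split(t, part, "m")        # part[1] = minutes, part[2] = "SS.SSSs"
        sub(/s$/, "", part[2])     # drop the trailing "s"
        printf "%.1f\n", (part[1] * 60 + part[2]) / sbu
    }'
}

# 130.2 s per SBU is an assumption back-derived from the numbers above.
to_sbu 190m43.728s 130.2
```

With a different machine, only the second argument changes; the first can be pasted straight from the `time` output.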

To automate this, I had to change my testit script to:

#!/usr/bin/tclsh
source ../test/test.tcl
run_parallel 4 run_std
exit

Otherwise it doesn't exit and the timing is invalid.

Did you get the statement "Regression tests failed."?  Looking in
TESTDIR.1 I do have the line:

g.5  ddoyscript.tcl ./TESTDIR.1 2 6 o 5

but I have no idea what it means.

  --Bruce



[Fwd: Re: Berkeley DB]

2005-07-10 Thread Bruce Dubbs


 Original Message 
Subject: Re: Berkeley DB
Date: Sat, 09 Jul 2005 22:00:01 -0500
From: David Jensen [EMAIL PROTECTED]
To: Bruce Dubbs [EMAIL PROTECTED]
References: [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]

Bruce Dubbs wrote:

David Jensen wrote:

Bruce Dubbs wrote:

Probably we should ditch the run_parallel, just add a note that it could
be used.

Not if it fills up 12G!  That doesn't seem reasonable to me.

I mean used with 'number of processors'.

I've lost the thread here.  I'm not sure what you mean.

Oh, I think I see the confusion:
run_parallel 4 starts an overseeing process that runs 'run_std' as 4
parallel processes.  It is not required, just suggested as a speed-up.
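
As a rough illustration of "used with 'number of processors'", the sketch below generates the testit script from this thread with the run_parallel count set to the number of online processors. The getconf _NPROCESSORS_ONLN query is an assumption (it is not mentioned in the thread and may not exist on every system), make_testit is a hypothetical helper name, and the fallback of 4 is simply the value used above. Note that elsewhere in this thread parallel 4 and parallel 10 are reported to take about the same time, so a higher count is not necessarily faster.

```shell
#!/bin/sh
# Ask the system how many processors are online; fall back to 4, the
# value used in this thread, if getconf cannot answer.
nprocs=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 4)

# make_testit is a hypothetical helper: it prints a tclsh script that runs
# run_std under run_parallel with the given number of processes.
make_testit() {
    # $1 = number of parallel processes
    cat << EOF
#!/usr/bin/tclsh
source ../test/test.tcl
run_parallel $1 run_std
exit
EOF
}

# Print the script. To use it, redirect to a file and make it executable:
#   make_testit "$nprocs" > testit && chmod +x testit && time ./testit
make_testit "$nprocs"
```

Always passing an explicit count matters here: the form with no count is the one reported earlier to have started 1582 processes and filled a 12G partition.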

I am getting some serious loading now in 'secondary index tests'

--
David Jensen




[Fwd: Re: Berkeley DB]

2005-07-10 Thread Bruce Dubbs


 Original Message 
Subject: Re: Berkeley DB
Date: Sun, 10 Jul 2005 12:29:11 -0500
From: Bruce Dubbs [EMAIL PROTECTED]
To: David Jensen [EMAIL PROTECTED]
References: [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]
[EMAIL PROTECTED]

David Jensen wrote:
 Bruce Dubbs wrote:
 
 Did you get the statement "Regression tests failed."?  Looking in
 TESTDIR.1 I do have the line:

  

 Yes I did, with the single run, I had no failures on the parallel runs.
 
 g.5  ddoyscript.tcl ./TESTDIR.1 2 6 o 5

  

 Mine says:
 /usr/bin/tclsh8.4 ../dist/../test/wrap.tcl  ./TESTDIR/dead007.log.5 
 ddoyscript.tcl ./TESTDIR 2 6 o 5
 
 Yours had about half of the string clipped from the beginning, so it did
 not match a known pattern:
 .*?wrap\.tcl.*|
 That is from test.tcl, line 319.
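
To make the matching point concrete, here is a small sketch that checks the two log lines quoted in this thread against the wrap.tcl part of that pattern. It uses grep rather than Tcl, and it tests only this one alternative; the real checker in test.tcl has a longer pattern list that is not reproduced here. is_known is a hypothetical helper name.

```shell
#!/bin/sh
# Does a log line contain the wrapper invocation the checker expects?
# is_known is a hypothetical helper that tests only the wrap.tcl pattern.
is_known() {
    printf '%s\n' "$1" | grep -Eq 'wrap\.tcl'
}

# The two lines as quoted in this thread.
full='/usr/bin/tclsh8.4 ../dist/../test/wrap.tcl  ./TESTDIR/dead007.log.5 ddoyscript.tcl ./TESTDIR 2 6 o 5'
clipped='g.5  ddoyscript.tcl ./TESTDIR.1 2 6 o 5'

for line in "$full" "$clipped"; do
    if is_known "$line"; then
        echo "known pattern:     $line"
    else
        echo "UNEXPECTED OUTPUT: $line"
    fi
done
```

The clipped line has lost everything up to and including wrap.tcl, so it falls through to the unexpected case even though the underlying test may have run normally.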

Hmm.  It seems the only problem is the loss of that half line in the
log.  I wonder what could have caused that.  The only thing I can
think of is a bug (race condition?) in a shared library that writes the
code.

 I'm beginning to think they are not staying on top of this test-suite.
 I already reported one bug,  both of us have found another.
 
 Maybe recommend not running it?

Yes.  Perhaps we should write a hint detailing these issues and put a
note in the book about this test suite pointing to the hint.  I'm not sure
it really benefits users to run an 80 SBU test that seems to be flaky.

BTW, I'm going to post our messages to BLFS-dev.  The thread started as
a simple question, but has developed to a point where others should be
able to see it.

  -- Bruce



[Fwd: Re: Berkeley DB]

2005-07-10 Thread Bruce Dubbs


 Original Message 
Subject: Re: Berkeley DB
Date: Sun, 10 Jul 2005 07:25:32 -0500
From: David Jensen [EMAIL PROTECTED]
To: Bruce Dubbs [EMAIL PROTECTED]
References: [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]

Bruce Dubbs wrote:

$ cat > testit << EOF
#!/usr/bin/tclsh

source ../test/test.tcl
run_parallel 4 run_std
EOF

$ chmod +x testit
$ time ./testit

  

I forgot to time it; however, it printed the start and end times.
I get 236 SBU for the single run.  That seems about right, given 80 SBU
with run_parallel 4.

Now, I have:
UNEXPECTED OUTPUT: WARNING: log record type __db_pg_new: not tested
UNEXPECTED OUTPUT: WARNING: log record type __db_pg_prepare: not tested
Regression Tests Failed
Check UNEXPECTED OUTPUT lines.

I did not get these two warnings in four runs of run_parallel 4, or one
run_parallel 10.  I will send a report to Sleepycat.

Note: parallel 4 and parallel 10 ran about the same, 80 SBU.
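
For a sense of scale, the ratio between those two figures can be worked out with a one-line awk call. This is only arithmetic on the numbers quoted in this message, not a new measurement, and speedup is a hypothetical helper name.

```shell
#!/bin/sh
# Ratio of the single-process run to the run_parallel 4 run, using the
# SBU figures quoted in this message. speedup is a hypothetical helper.
speedup() {
    # $1 = SBU for the plain run_std run, $2 = SBU with run_parallel
    awk -v serial="$1" -v parallel="$2" 'BEGIN { printf "%.2f\n", serial / parallel }'
}

speedup 236 80
```

That comes to a little under 3x from four processes, and the same 80 SBU with ten processes suggests the extra processes were not the limiting factor on that machine.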

--
David Jensen



