Hi,
This is a follow-up to an earlier email where I reported that PVFS
1.5.1 failed to copy binary files from several DVDs.

I'm running a 3-node Rocks 4.2.1 cluster (CentOS 4.4, x86_64); the nodes
are connected via an unmanaged switch.

I have reinstalled the Rocks Cluster (all nodes), including the PVFS2 Roll.
The cluster is set up with the frontend as the metadata server, and the
other two nodes are PVFS2 I/O servers and clients.  The /mnt/pvfs2
area is on a 3-disk RAID 0 partition formatted as ext3.
After installing I ran the test steps in the "PVFS2 Quick Start
Guide". The test steps ran without error.
I upgraded to PVFS 2.6.2 on all nodes and re-ran the test steps, again
no errors or problems.

I built PVFS 2.6.2 with the following:

./configure --with-kernel=</path/to/kernel26/> \
    --enable-kernel-sendfile --prefix=/usr/local/pvfs2/
then
make all
make kmod_install
make install

On each node I have a script that lists the files on the DVD disc
loaded on that node.
Each file is copied if it does not exist on the HDD (PVFS area) and
the copy is immediately verified:

cp /dvd/file1 /mnt/pvfs2/file1
cmp /dvd/file1 /mnt/pvfs2/file1

`cmp` does not report any error.
This has been done for 60-70 DVDs.

If I insert a DVD that has previously been copied, my script finds that
a file exists in the PVFS area and runs `cmp` against the DVD file. If
the file fails this comparison, it is deleted, copied again, and
verified (cmp).
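
The per-file logic described above can be sketched roughly as follows.
This is a minimal illustration, not the actual script: the function name
`copy_verified` is hypothetical, and the loop in the trailing comment
assumes the /dvd and /mnt/pvfs2 mount points mentioned in this message.

```shell
#!/bin/sh
# Hypothetical sketch of the per-file logic; not the original script.
# copy_verified SRC DST: copy SRC to DST unless DST already matches SRC,
# then verify the fresh copy with cmp.
copy_verified() {
    src=$1
    dst=$2
    if [ -e "$dst" ]; then
        # File was copied earlier: re-check it against the DVD.
        if cmp -s "$src" "$dst"; then
            echo "ok (existing): $dst"
            return 0
        fi
        echo "MISMATCH (existing): $dst, re-copying" >&2
        rm -f "$dst"
    fi
    cp "$src" "$dst" || return 1
    # Verify immediately after the copy.
    if cmp -s "$src" "$dst"; then
        echo "ok (copied): $dst"
    else
        echo "MISMATCH (after copy): $dst" >&2
        return 1
    fi
}

# Example: process every regular file at the top level of the DVD.
# for f in /dvd/*; do
#     [ -f "$f" ] && copy_verified "$f" "/mnt/pvfs2/$(basename "$f")"
# done
```

The "MISMATCH (existing)" branch corresponds to the initial `cmp` check
on a previously copied file.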

I notice that previously copied files frequently and randomly fail the
_initial_ `cmp` check if more than one node is 'active', i.e.
processing a DVD.
Once the file is deleted and copied again, the second `cmp` check passes.

Some details:
The files do not fail the `cmp` check immediately after being copied -
only when checking a previously copied file.
The byte offset at which `cmp` reports the files differ is different each time.
Re-inserting the same DVD several times results in different files
failing the first `cmp` check.
The second check (immediately after the copy is finished) is always passed.
This occurs rarely, if at all (i.e. I haven't noticed it), when only
one node is processing a DVD.
This only occurs with binary files, which are relatively large (200 MB - 2 GB).
This never occurs with text files, which are also small (100s of KB).
The pvfs2-client.log file is empty on each node.
I have tried using diff and experience the same results.

This is similar to an error I was seeing in PVFS 1.5.1, hence the
upgrade.  I've also changed my previous script, which used `dd` to copy
the DVD to memory (approx 8GB) and then wrote this ISO file to the PVFS2
area. That worked fine for initial copies, but failed for re-copies.  At
that time I wasn't verifying the copy, so it was the copy to the
PVFS2 area that failed.

Finally, on one occasion when manually running `cmp` on a file I
noticed the following sequence.
cmp file1 file2 (pass)
cmp file1 file2 (pass)
diff file1 file2 (fail)
cmp file1 file2 (fail)
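
A check like this could be repeated more systematically with something
along the lines of the sketch below, which reads one file several times
and compares checksums. `read_stable` is a hypothetical helper name, it
assumes `md5sum` is installed, and it has not been run as part of this
report.

```shell
# Hypothetical helper, not part of the script described in this message.
# read_stable FILE [N]: checksum FILE N times (default 5) and report
# whether every read returned the same data.
read_stable() {
    file=$1
    n=${2:-5}
    [ -r "$file" ] || return 2
    first=""
    i=1
    while [ "$i" -le "$n" ]; do
        sum=$(md5sum < "$file" | cut -d' ' -f1)
        if [ -z "$first" ]; then
            first=$sum
        elif [ "$sum" != "$first" ]; then
            # A later read produced different bytes than the first read.
            echo "read $i differs: $sum vs $first"
            return 1
        fi
        i=$((i + 1))
    done
    echo "stable: $first"
}

# Example: read_stable /mnt/pvfs2/file1 10
```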

Is this known behavior with a known workaround/configuration setting?
The behavior I see made me guess a caching or network issue (there are
no other machines on the cluster network).
Can anyone suggest PVFS configuration settings that will make PVFS more robust?

I'm not a programmer or linux guru - I just spent this summer
converting from winxp...
I'm happy to explore some possible fixes, but don't assume too much :)

Thanks in advance
Mark