Re: [Pvfs2-users] pvfs2 problem

Sam Lang Tue, 18 Nov 2008 08:48:28 -0800


Hi Brian,

Sorry for the delayed response! The likely cause of your errors is that
the servers are being overloaded by the clients, and the I/O operations
take so long that the clients cancel them once a timeout is reached. If
you want to run load tests of this kind, you can raise the timeouts by
modifying the configuration options in the PVFS config file. Check out:


http://www.pvfs.org/cvs/pvfs-2-7-branch-docs/doc//pvfs-config-options.php#ClientJobFlowTimeoutSecs
http://www.pvfs.org/cvs/pvfs-2-7-branch-docs/doc//pvfs-config-options.php#ClientJobBMITimeoutSecs
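
As a rough sketch of what that change might look like in the server
config file: the two option names are the ones documented at the links
above, but the 600-second values are purely illustrative assumptions
(not recommendations), and the placement inside the Defaults section is
assumed from the option documentation, so check it against your own
generated config before relying on it.

```
<Defaults>
	# ... existing options left unchanged ...

	# Assumed example values: how long a client waits before cancelling
	# a BMI (network) operation or a flow (bulk data transfer).
	ClientJobBMITimeoutSecs 600
	ClientJobFlowTimeoutSecs 600
</Defaults>
```

The same edit would need to be made consistently in the config used by
every server.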

-sam

On Oct 10, 2008, at 10:37 AM, <[EMAIL PROTECTED]><[EMAIL PROTECTED]> wrote:


Hello,

I am trying to do some "load tests" with pvfs2, but find the following
in the logs (I produced them with 'pvfs2-set-debugmask -m /mnt/test
"network,server,client"'):

Client:

[D 11:34:10.421223] [INFO]: Mapping pointer 0x2b875cb28000 for I/O.
[D 11:34:10.433532] [INFO]: Mapping pointer 0x6a9000 for I/O.
[E 11:40:02.941501] job_time_mgr_expire: job time out: cancelling bmi
operation, job_id: 31963.

Server01:

[D 10/08 11:40] BMI_tcp_post_send_generic: Sent: 24 bytes of data.

[D 10/08 11:40] [BMI CONTROL]: BMI_set_info: set_info: 7570864 option: 6
[D 10/08 11:40] [BMI CONTROL]: BMI_set_info: searching for ref 7570864
[D 10/08 11:40] [BMI CONTROL]: BMI_set_info: decremented ref 7570864 to: 0

[D 10/08 11:40] server_state_machine_complete 0x2aaab4022030
[D 10/08 11:40] server_state_machine_terminate 0x2aaab4022030
[D 10/08 11:40] Error: bmi_tcp: Connection reset by peer
[D 10/08 11:40] BMI_testcontext completing: 46912585631680
[E 10/08 11:40] handle_io_error: flow proto error cleanup started on
0x2aaab0008690: Connection reset by peer
[E 10/08 11:40] handle_io_error: flow proto 0x2aaab0008690 canceled 0
operations, will clean up.

[E 10/08 11:40] handle_io_error: flow proto 0x2aaab0008690 error cleanup
finished: Connection reset by peer
[D 10/08 11:40] [BMI CONTROL]: BMI_set_info: set_info: 7811296 option: 6
[D 10/08 11:40] [BMI CONTROL]: BMI_set_info: searching for ref 7811296
[D 10/08 11:40] [BMI CONTROL]: BMI_set_info: decremented ref 7811296 to: 0

[D 10/08 11:40] [BMI CONTROL]: bmi_addr_drop: bmi discarding address:
7811296
[D 10/08 11:40] server_state_machine_complete 0x2aaab40381d0

The cluster configuration is as follows:
- three hosts, each with a ~400 GB ext3 slice mounted from a SAN via FC,
  acting as metadata servers, I/O servers and clients;
- two hosts acting as clients only.
- Debian 4.0, kernel 2.6.24, pvfs2 module 2.7.1

The hosts are connected to each other by gigabit Ethernet. I am
mounting the filesystem on each client-only host from a different
server: is this correct? What is the difference between mounting from
different servers and using one server for all clients?

Each server/client host instead uses itself as server. Again, would it
be better to use other hosts as servers?

Last, but not least: have you got any clues on the possible cause of
the error? I checked all the other logs, and they are perfectly clean.
Also, pvfs2-ping doesn't report anything wrong.

Please forgive me if the above questions have already been answered: I
tried searching the mailing list archives but without success...


Thank you very much for your kind attention!


_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users

