Hi,
Well, your configuration can probably avoid the problem with the
benchmark, which I cannot run because the creation of the OrangeFS file
system fails. The batch_create error is still there because it appears as
soon as I launch the servers. The creation of the root directory fails
too, as I mentioned. I think this is the relevant part of the log messages
regarding the problem with the root directory:
[D 05/15/2015 21:08:37] server_post_unexpected_recv
[D 05/15/2015 21:08:37] server_op_state_get_machine 999
[D 05/15/2015 21:08:37] Initialization completed successfully.
[D 05/15/2015 21:08:37] server_state_machine_alloc_noreq 27
[D 05/15/2015 21:08:37] server_op_state_get_machine 27
[D 05/15/2015 21:08:37] server_state_machine_start_noreq 0x1d6fa10
[D 05/15/2015 21:08:37] *** Trove KeyVal Read of /dda
[D 05/15/2015 21:08:37] op_queue add: 0x1d71100
[D 05/15/2015 21:08:37] [DBPF THREAD]: [KEYVAL -1]: -7
[D 05/15/2015 21:08:37] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(KEYVAL_READ)
[D 05/15/2015 21:08:37] warning: keyval read error on handle 1048576 and
key= /dda (BDB0073 DB_NOTFOUND: No matching key/data pair found)
[D 05/15/2015 21:08:37] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(KEYVAL_READ) (ret: -1073742082)
[D 05/15/2015 21:08:37] op_queue add: 0x1d71100
[D 05/15/2015 21:08:37] server_state_machine_alloc_noreq 46
[D 05/15/2015 21:08:37] server_op_state_get_machine 46
[D 05/15/2015 21:08:37] server_state_machine_start_noreq 0x1d70f80
[D 05/15/2015 21:08:37] mgmt-create-root-dir: Init dist-dir-attr for dir
meta handle 1048576 with tree_height=1, num_servers=2, bitmap_size=1,
split_size=100, server_no=0 and branch_level=1
[D 05/15/2015 21:08:37] mgmt-create-root-dir: Init dist_dir_bitmap as:
[D 05/15/2015 21:08:37] i=0 : 00 00 00 03
[D 05/15/2015 21:08:37]
[D 05/15/2015 21:08:37] creating 1 local dirdata files
[D 05/15/2015 21:08:37] creating 1 remote dirdata files
[D 05/15/2015 21:08:37] job_precreate_pool_get_handles: requesting 1
handles of type 16
[E 05/15/2015 21:08:37] Warning: unable to create root dir due to error:
Invalid argument
[E 05/15/2015 21:08:37] Your FS may be in an inconsistent state
[D 05/15/2015 21:08:37] server_state_machine_complete_noreq: 0x1d70f80
[D 05/15/2015 21:08:37] server_state_machine_terminate 0x1d70f80
[E 05/15/2015 21:08:43] PVFS2 server got signal 15 (server_status_flag:
4177919)
[D 05/15/2015 21:08:43] server_state_machine_terminate 0x1d2e970
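
In case it helps anyone reading a debug log like the one above, here is a minimal sketch in Python that keeps only the error-level ([E ...]) entries. It is not part of OrangeFS; the names parse_entries and errors_only are made up for illustration, and the entry format is assumed from the excerpt above ("[LEVEL MM/DD/YYYY HH:MM:SS] message"). Lines that do not start with such a header are treated as continuations of the previous entry, which is how the mail client wrapped the long lines here.

```python
import re

# Start of a log entry, e.g. "[E 05/15/2015 21:08:37] message".
# The format is assumed from the excerpt above, not from OrangeFS documentation.
ENTRY_START = re.compile(
    r"^\[([A-Z]) (\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\] ?(.*)$"
)


def parse_entries(lines):
    """Yield (level, timestamp, message) tuples, re-joining wrapped lines."""
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY_START.match(line)
        if match:
            # A new entry starts, so emit the one collected so far.
            if current is not None:
                yield tuple(current)
            current = [match.group(1), match.group(2), match.group(3)]
        elif current is not None and line.strip():
            # No header: treat the line as a wrapped continuation.
            current[2] = (current[2] + " " + line.strip()).strip()
    if current is not None:
        yield tuple(current)


def errors_only(lines):
    """Return only the entries whose level is E (error)."""
    return [entry for entry in parse_entries(lines) if entry[0] == "E"]


if __name__ == "__main__":
    sample = [
        "[D 05/15/2015 21:08:37] creating 1 remote dirdata files",
        "[E 05/15/2015 21:08:37] Warning: unable to create root dir due to error:",
        "Invalid argument",
        "[E 05/15/2015 21:08:37] Your FS may be in an inconsistent state",
    ]
    for level, timestamp, message in errors_only(sample):
        print(timestamp, message)
```

To run it against a real server log, pass the open file object (or its lines) to errors_only instead of the sample list.
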
Hope this helps.
Regards,
Juan
On 15/05/15 at 22:13, Becky Ligon wrote:
> Juan:
>
> You may have hit upon another problem that we've encountered, where the
> splitting of directories runs into a race condition. Try this:
>
> 1. In your orangefs-server.conf file, set DistrDirServersInitial 1 and
> DistrDirServersMax 1 in your multi-server configuration installation.
>
> 2. Delete your data and metadata areas and recreate. Start your servers.
>
> 3. Run your tests.
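
For reference, a sketch of what step 1 might look like in orangefs-server.conf. It assumes that both options belong in the <FileSystem> section; that placement is an assumption to check against your own generated file, and the Name line and the comment are placeholders for the existing settings:

```
<FileSystem>
    Name orangefs
    # ... existing settings left unchanged ...
    DistrDirServersInitial 1
    DistrDirServersMax 1
</FileSystem>
```

The same file would need to be used by every server, followed by step 2 (recreating the data and metadata areas) before the servers are restarted.
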
>
> See if this helps!
>
> NOTE: We are working on a fix for this problem right now but don't have
> a working solution just yet.
>
> Becky
>
> On Fri, May 15, 2015 at 3:38 PM, Juan PC <[email protected]> wrote:
>
> Hi Becky,
>
> Thank you for your response :-)
>
> The problem is that the log file grows at a rate of around 2 MiB per
> second (EventLogging is set to none!) and, more importantly, a simple
> pvfs2-ls does not work. The latter is probably due to an error message
> that I get after starting the server that stores the root directory:
>
> [E 05/15/2015 18:38:08] Warning: unable to create root dir due to error:
> Resource temporarily unavailable
> [E 05/15/2015 18:38:08] Your FS may be in an inconsistent state
>
> although the batch_create errors appear later, when a second server
> is started.
>
> I have spent a lot of time trying different compilation options,
> configurations, db versions, checking that I run the right executables,
> that they use the same file system configuration file, etc., and the
> result is always the same. Well, to be honest, I was able to activate
> the file system once (I do not know how), but it started failing when I
> tried to create a few thousand files per directory (benchmark
> hpcs-io_1.2.0-rc1, scenarios 9-12).
>
> My feeling is that, with two servers, the problematic server (the one
> meant to store the root directory) does not communicate correctly with
> the second server. There is no firewall, SELinux is disabled, etc.
>
> Some final remarks:
> - Security is always the default one; I have not used either the
> --enable-security-key or the --enable-security-cert option.
> - The same steps with OrangeFS 2.8.7 give no problem at all.
>
> So I guess that I must be doing something terribly wrong, but I do not
> know what :-(
>
> If I can do something (for instance, running the servers with
> EventLogging set to verbose), just let me know.
>
> Regards,
>
> Juan
>
> On 15/05/15 at 20:12, Becky Ligon wrote:
> > This is normal for 2.9.1, and it is okay to get the messages you are seeing.
> > batch_create comes into play when a server needs to gather more handles
> > (like inodes) from another server. The "Resource temporarily
> > unavailable" is generated when the capability associated with this
> > request has timed out. So, the calling server regenerates the
> > capability and resends the batch_create request.
> >
> > The OFS development team is changing when these capabilities get
> > generated for batch_create requests to alleviate this problem. For now,
> > you can ignore these messages.
> >
> > Sorry for the inconvenience.
> >
> > Becky
> >
> >
> >
> > On Fri, May 15, 2015 at 11:48 AM, Juan PC <[email protected]> wrote:
> >
> > Dear Becky,
> >
> > I am trying to use orangefs-2.9.1, but every time I run the servers I
> > get the message in the subject line on one of the servers, and its log
> > file grows very quickly. The last reference that I have seen about
> > this problem is
> > http://www.beowulf-underground.org/pipermail/pvfs2-users/2015-April/004432.html.
> > I have used the --disable-capcache option of configure, but with the
> > same result. Do you know if this issue has already been fixed or if
> > there is a workaround?
> >
> > Best regards,
> >
> > Juan
> >
> >
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users