Good to know :-). However, I am using the default security mode (the old
one, I think).
Regards,
Juan
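
For anyone who wants to check their nodes for the clock drift discussed in
this thread, here is a minimal sketch in Python. It assumes you have already
collected each node's current time at roughly the same moment (for example,
by running date on every node). The node names, the sample times, and the
one-second tolerance are hypothetical, since the thread does not say how
much drift the servers tolerate.

```python
from datetime import datetime, timedelta


def max_clock_skew(node_times):
    """Return (skew, earliest_node, latest_node) for a dict of node -> datetime."""
    if len(node_times) < 2:
        raise ValueError("need the time of at least two nodes")
    earliest = min(node_times, key=node_times.get)
    latest = max(node_times, key=node_times.get)
    return node_times[latest] - node_times[earliest], earliest, latest


def clocks_in_sync(node_times, tolerance=timedelta(seconds=1)):
    """True if the largest difference between any two nodes is within tolerance."""
    skew, _, _ = max_clock_skew(node_times)
    return skew <= tolerance


# Hypothetical readings, not taken from the cluster in this thread.
sample_times = {
    "node1": datetime(2015, 5, 22, 18, 0, 0),
    "node2": datetime(2015, 5, 22, 18, 0, 0),
    "node3": datetime(2015, 5, 22, 20, 30, 0),
}

skew, earliest, latest = max_clock_skew(sample_times)
print(f"largest skew: {skew} ({earliest} vs {latest})")
print("in sync:", clocks_in_sync(sample_times))
```

The tolerance is a parameter on purpose: pick whatever bound makes sense for
your security mode and then fix any offending node with your usual time
synchronization setup.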
On 23/05/15 at 00:52, Boyd Wilson wrote:
> The new capability-based security uses PKI, so it is time-dependent and
> clock drift between nodes could cause problems. As far as I can tell, we
> have not documented this, so we need to do so.
>
> -b
>
> On Fri, May 22, 2015 at 6:49 PM Juan PC <[email protected]> wrote:
>
> Hi Becky,
>
> When I tried to set up an OrangeFS cluster with 4 and 8 nodes, the
> batch_create error message appeared again. I then realized that some of
> my nodes had the wrong time (with a maximum difference of two and a half
> hours between nodes). After synchronizing the clocks, the batch_create
> problem seems to be gone. Does this make sense? I mean, can a wrong time
> on some servers cause the problem? I do not remember seeing any
> recommendation or warning about node times in the OrangeFS documentation.
>
> Regards,
>
> Juan
>
> On 16/05/15 at 22:59, Becky Ligon wrote:
> > Juan:
> >
> > The conf file looks good. Can you send me your server log files?
> >
> > Becky
> >
> > On Saturday, May 16, 2015, Juan PC <[email protected]> wrote:
> >
> > It is attached.
> >
> > I do not know if this is important, but one thing that I have seen with
> > this configuration file is that if I start the second server just after
> > starting the first one, everything seems to work. However, if I wait
> > for a few seconds, the root directory error message appears in the first
> > server. Then, when I launch the second server, I get the avalanche of
> > batch_create error messages. This avalanche seems to stop once it has
> > generated around 1 GB of data. However, because of the problem with the
> > root directory, the file system does not work.
> >
> > I have checked whether waiting for a few seconds between server starts
> > is an issue in OrangeFS 2.8.7, and it is not.
> >
> > Regards,
> >
> > Juan
> >
> > On 16/05/15 at 17:59, Becky Ligon wrote:
> > > Can you send me your orangefs-server.conf file?
> > >
> > > NOTE: Do not use native IB with this version. We have a known issue
> > > with distributed directories and IB that we are currently working on.
> > >
> > > Becky
> > >
> > > On Sat, May 16, 2015 at 11:43 AM, <[email protected]> wrote:
> > >
> > > No, only TCP over Ethernet. We have IB NICs, but I have not compiled
> > > OrangeFS with support for them.
> > >
> > > Juan
> > >
> > >
> > > Quoting "Becky Ligon" <[email protected]>:
> > >
> > > Are you using native IB?
> > >
> > > Becky
> > >
> > > On May 15, 2015, at 5:39 PM, Juan PC <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > Well, your configuration can probably avoid the problem with the
> > > benchmark, which I cannot run because the creation of the OrangeFS
> > > file system fails.
> > >
> > > The batch_create error is still there because it appears as soon as I
> > > launch the servers. The creation of the root directory fails too, as I
> > > have mentioned. I think this is the relevant part of the log messages
> > > regarding the problem with the root directory:
> > >
> > > [D 05/15/2015 21:08:37] server_post_unexpected_recv
> > > [D 05/15/2015 21:08:37] server_op_state_get_machine 999
> > > [D 05/15/2015 21:08:37] Initialization completed successfully.
> > > [D 05/15/2015 21:08:37] server_state_machine_alloc_noreq 27
> > > [D 05/15/2015 21:08:37] server_op_state_get_machine 27
> > > [D 05/15/2015 21:08:37] server_state_machine_start_noreq 0x1d6fa10
> > > [D 05/15/2015 21:08:37] *** Trove KeyVal Read of /dda
> > > [D 05/15/2015 21:08:37] op_queue add: 0x1d71100
> > > [D 05/15/2015 21:08:37] [DBPF THREAD]: [KEYVAL -1]: -7
> > > [D 05/15/2015 21:08:37] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE (KEYVAL_READ)
> > > [D 05/15/2015 21:08:37] warning: keyval read error on handle 1048576 and key= /dda (BDB0073 DB_NOTFOUND: No matching key/data pair found)
> > > [D 05/15/2015 21:08:37] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE (KEYVAL_READ) (ret: -1073742082)
> > > [D 05/15/2015 21:08:37] op_queue add: 0x1d71100
> > > [D 05/15/2015 21:08:37] server_state_machine_alloc_noreq 46
> > > [D 05/15/2015 21:08:37] server_op_state_get_machine 46
> > > [D 05/15/2015 21:08:37] server_state_machine_start_noreq 0x1d70f80
> > > [D 05/15/2015 21:08:37] mgmt-create-root-dir: Init dist-dir-attr for dir meta handle 1048576 with tree_height=1, num_servers=2, bitmap_size=1, split_size=100, server_no=0 and branch_level=1
> > > [D 05/15/2015 21:08:37] mgmt-create-root-dir: Init dist_dir_bitmap as:
> > > [D 05/15/2015 21:08:37] i=0 : 00 00 00 03
> > > [D 05/15/2015 21:08:37]
> > > [D 05/15/2015 21:08:37] creating 1 local dirdata files
> > > [D 05/15/2015 21:08:37] creating 1 remote dirdata files
> > > [D 05/15/2015 21:08:37] job_precreate_pool_get_handles: requesting 1 handles of type 16
> > > [E 05/15/2015 21:08:37] Warning: unable to create root dir due to error: Invalid argument
> > > [E 05/15/2015 21:08:37] Your FS may be in an inconsistent state
> > > [D 05/15/2015 21:08:37] server_state_machine_complete_noreq: 0x1d70f80
> > > [D 05/15/2015 21:08:37] server_state_machine_terminate 0x1d70f80
> > > [E 05/15/2015 21:08:43] PVFS2 server got signal 15 (server_status_flag: 4177919)
> > > [D 05/15/2015 21:08:43] server_state_machine_terminate 0x1d2e970
> > >
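
Since the server log grows very quickly, it can help to pull out only the
error entries. The following Python sketch assumes that each entry starts
with a bracketed level letter and timestamp, as in the excerpt above, and
that each entry sits on a single line in the real log file. The function
name and the sample list are illustrative only.

```python
import re
from collections import Counter

# "[E 05/15/2015 21:08:37] message": level letter, date, time, then the text.
LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]) (?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\] ?(?P<msg>.*)$"
)


def error_messages(lines):
    """Count the distinct messages logged at level E."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "E":
            counts[match.group("msg").strip()] += 1
    return counts


# A few entries copied from the excerpt above.
sample = [
    "[D 05/15/2015 21:08:37] creating 1 remote dirdata files",
    "[E 05/15/2015 21:08:37] Warning: unable to create root dir due to error: Invalid argument",
    "[E 05/15/2015 21:08:37] Your FS may be in an inconsistent state",
    "[D 05/15/2015 21:08:37] server_state_machine_terminate 0x1d70f80",
]

for message, count in error_messages(sample).most_common():
    print(count, message)
```

Counting by message keeps the output short even when the same error is
repeated many thousands of times.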
> > > Hope this helps.
> > >
> > > Regards,
> > >
> > > Juan
> > >
> > >
> > > On 15/05/15 at 22:13, Becky Ligon wrote:
> > > Juan:
> > >
> > > You may have hit upon another problem that we've encountered, where
> > > the splitting of directories goes into a race condition. Try this:
> > >
> > > 1. In your orangefs-server.conf file, set DistrDirServersInitial 1 and
> > >    DistrDirServersMax 1 in your multi-server configuration installation.
> > >
> > > 2. Delete your data and metadata areas and recreate them. Start your
> > >    servers.
> > >
> > > 3. Run your tests.
> > >
> > > See if this helps!
> > >
> > > NOTE: We are working on a fix for this problem right now but don't have
> > > a working solution just yet.
> > >
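
Step 1 above could be scripted along these lines. This is only a sketch: it
assumes both options are already present in orangefs-server.conf as plain
"Name value" lines, and the sample fragment with the starting value 2 is
hypothetical. It does not cover step 2 (deleting and recreating the data and
metadata areas).

```python
import re


def set_conf_option(conf_text, key, value):
    """Set every existing 'key value' line to the new value, keeping indentation."""
    pattern = re.compile(rf"^([ \t]*){re.escape(key)}[ \t]+\S.*$", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{key} {value}", conf_text
    )
    if count == 0:
        # Refuse to guess where a missing option should go in the file.
        raise KeyError(f"{key} not found in configuration text")
    return new_text


# Hypothetical fragment; a real orangefs-server.conf contains much more.
sample_conf = "\tDistrDirServersInitial 2\n\tDistrDirServersMax 2\n"

updated = sample_conf
for option in ("DistrDirServersInitial", "DistrDirServersMax"):
    updated = set_conf_option(updated, option, 1)
print(updated)
```

The function raises an error instead of appending a missing option, because
where a new line belongs in the file is not something this sketch knows.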
> > > Becky
> > >
> > > On Fri, May 15, 2015 at 3:38 PM, Juan PC <[email protected]> wrote:
> > >
> > > Hi Becky,
> > >
> > > Thank you for your response :-)
> > >
> > > The problem is that the log file grows at a rate of around 2 MiB per
> > > second (EventLogging is set to none!) and, more importantly, a simple
> > > pvfs2-ls does not work. The latter is probably due to an error message
> > > that I get after starting the server that stores the root file system:
> > >
> > > [E 05/15/2015 18:38:08] Warning: unable to create root dir due to error: Resource temporarily unavailable
> > > [E 05/15/2015 18:38:08] Your FS may be in an inconsistent state
> > >
> > > although the batch_create errors appear later, when a second server is
> > > started.
> > >
> > > I have spent a lot of time trying different compilation options,
> > > configurations, and db versions, checking that I run the right
> > > executables, that they use the same file system configuration file,
> > > etc., and the result is always the same. Well, to be honest, I was able
> > > to activate the file system once (I do not know how), but it started
> > > failing when I tried to create a few thousand files per directory
> > > (benchmark hpcs-io_1.2.0-rc1, scenarios 9-12).
> > >
> > > My feeling is that, with two servers, the problematic server (the one
> > > meant to store the root directory) does not communicate correctly with
> > > the second server. There is no firewall, SELinux is disabled, etc.
> > >
> > > Some final remarks:
> > > - Security is always the default one; I have not used either the
> > >   --enable-security-key or the --enable-security-cert option.
> > > - The same steps with OrangeFS 2.8.7 cause no problem at all.
> > >
> > > So I guess that I must be doing something terribly wrong, but I do not
> > > know what :-(
> > >
> > > If I can do something (for instance, running the servers with
> > > EventLogging set to verbose), just let me know.
> > >
> > > Regards,
> > >
> > > Juan
> > >
> > > On 15/05/15 at 20:12, Becky Ligon wrote:
> > > This is normal for 2.9.1, and it is okay to get the messages you are
> > > seeing. batch_create comes into play when a server needs to gather
> > > more handles (like inodes) from another server. The "Resource
> > > temporarily unavailable" error is generated when the capability
> > > associated with this request has timed out. So, the calling server
> > > regenerates the capability and resends the batch_create request.
> > >
> > > The OFS development team is changing when these capabilities get
> > > generated for batch_create requests to alleviate this problem. For now,
> > > you can ignore these messages.
> > >
> > > Sorry for the inconvenience.
> > >
> > > Becky
> > >
> > >
> > >
> > > On Fri, May 15, 2015 at 11:48 AM, Juan PC <[email protected]> wrote:
> > >
> > > Dear Becky,
> > >
> > > I am trying to use orangefs-2.9.1, but every time I run the servers I
> > > get the message in the subject line in one of the servers, and its log
> > > file grows very quickly. The last reference that I have seen about this
> > > problem is:
> > > http://www.beowulf-underground.org/pipermail/pvfs2-users/2015-April/004432.html
> > >
> > > I have used the --disable-capcache option of configure, but got the
> > > same result. Do you know if this issue has already been fixed or if
> > > there is a workaround?
> > >
> > > Best regards,
> > >
> > > Juan
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
>
>
> --
> D. Juan Piernas Cánovas
> Departamento de Ingeniería y Tecnología de Computadores
> Facultad de Informática. Universidad de Murcia
> Campus de Espinardo - 30080 Murcia (SPAIN)
> Tel.: +34868887657 Fax: +34868884151
> email: [email protected]
> PGP public key:
>
> http://pgp.rediris.es:11371/pks/lookup?search=piernas%40ditec.um.es&op=index
>
> *** Please send me your documents in text, HTML, PDF, or
> PostScript format :-) ***
> _______________________________________________
> Pvfs2-users mailing list
> [email protected]
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>