Great! Thank you :-)

        Juan

On 24/05/15 at 17:51, Becky Ligon wrote:
> Juan:
> 
> We have also been able to recreate your problem with startup and
> creating the root directory information.  We are working now to put a
> fix in place.
> 
> Becky
> 
> On Fri, May 22, 2015 at 7:20 PM, Boyd Wilson <[email protected]> wrote:
> 
>     The default mode still uses PKI to some degree, but all of the expensive
>     signing operations are done without keys or with minimal keys. Time
>     drift may still affect it (possibly; I will have to check with the
>     developers who are more familiar with that code).
> 
>     -b
> 
>     On Fri, May 22, 2015 at 7:09 PM Juan PC <[email protected]> wrote:
> 
>         Good to know :-). However, I use the default security mode (the old one,
>         I think).
> 
>         Regards,
> 
>                 Juan
> 
>         On 23/05/15 at 00:52, Boyd Wilson wrote:
>         > The new capability-based security uses PKI, so it is time dependent, and
>         > time drift could cause problems.   As far as I can tell we have not
>         > documented this, so we need to do so.
>         >
>         > -b
>         >
>         > On Fri, May 22, 2015 at 6:49 PM Juan PC <[email protected]> wrote:
>         >
>         >     Hi Becky,
>         >
>         >     When I tried to set up an OrangeFS cluster with 4 and 8 nodes, the
>         >     batch_create error message appeared again. Then I realized
>         >     that some of my nodes had the wrong time (with a maximum difference of
>         >     two and a half hours between nodes). After synchronizing the clocks, the
>         >     batch_create problem seems to be gone. Does this make sense? I mean, can
>         >     a wrong time on some servers cause the problem? I do not remember seeing
>         >     any recommendation or warning about node times in the OrangeFS
>         >     documentation.
>         >
>         >     Regards,
>         >
>         >             Juan
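
A minimal sketch of the clock check described in the message above, assuming the current UTC time of each node has already been collected somehow (for instance by running `date -u` on every node). The node names, the readings, and the 5-second tolerance are hypothetical; the thread does not state what drift OrangeFS tolerates.

```python
from datetime import datetime, timedelta


def max_clock_skew(node_times):
    """Return (skew, earliest_node, latest_node) for a mapping node -> datetime."""
    if len(node_times) < 2:
        raise ValueError("need at least two nodes to compare")
    earliest = min(node_times, key=node_times.get)
    latest = max(node_times, key=node_times.get)
    return node_times[latest] - node_times[earliest], earliest, latest


# Hypothetical readings; in practice they would be gathered from the nodes
# at (roughly) the same moment.
readings = {
    "node1": datetime(2015, 5, 22, 18, 0, 0),
    "node2": datetime(2015, 5, 22, 18, 0, 2),
    "node3": datetime(2015, 5, 22, 20, 30, 0),
}

# Assumed threshold for this illustration only, not an OrangeFS value.
TOLERANCE = timedelta(seconds=5)

skew, early, late = max_clock_skew(readings)
if skew > TOLERANCE:
    print(f"clock skew of {skew} between {early} and {late}; "
          "synchronize the clocks before starting the servers")
```

Because the readings are never taken at exactly the same instant, a small tolerance is needed; a difference of hours, as reported here, stands out regardless of the threshold chosen.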
>         >
>         >     On 16/05/15 at 22:59, Becky Ligon wrote:
>         >     > Juan:
>         >     >
>         >     > The conf file looks good.  Can you send me your server log files?
>         >     >
>         >     > Becky
>         >     >
>         >     > On Saturday, May 16, 2015, Juan PC <[email protected]> wrote:
>         >     >
>         >     >     It is attached.
>         >     >
>         >     >     I do not know if this is important, but one thing that I have seen with
>         >     >     this configuration file is that if I run the second server just after
>         >     >     running the first server, everything seems to work. However, if I wait
>         >     >     for a few seconds, the error message about the root directory appears in
>         >     >     the first server. Then, when I launch the second server, I get the
>         >     >     avalanche of batch_create error messages. This avalanche seems to stop
>         >     >     once it has generated around 1 GB of data. However, because of the
>         >     >     problem with the root directory, the file system does not work.
>         >     >
>         >     >     I have checked whether waiting for a few seconds between server
>         >     >     executions is an issue in OrangeFS 2.8.7, and it is not.
>         >     >
>         >     >     Regards,
>         >     >
>         >     >             Juan
>         >     >
>         >     >     On 16/05/15 at 17:59, Becky Ligon wrote:
>         >     >     > Can you send me your orangefs-server.conf file?
>         >     >     >
>         >     >     > NOTE:  Do not use native IB with this version.  We have a known issue
>         >     >     > with distributed directories and IB that we are currently working on.
>         >     >     >
>         >     >     > Becky
>         >     >     >
>         >     >     > On Sat, May 16, 2015 at 11:43 AM, <[email protected]> wrote:
>         >     >     >
>         >     >     >     No, only TCP over Ethernet. We have IB NICs, but I have not compiled
>         >     >     >     OrangeFS with support for them.
>         >     >     >
>         >     >     >            Juan
>         >     >     >
>         >     >     >
>         >     >     >     Quoting "Becky Ligon" <[email protected]>:
>         >     >     >
>         >     >     >         Are you using native IB?
>         >     >     >
>         >     >     >         Becky
>         >     >     >
>         >     >     >         Sent from my iPhone
>         >     >     >
>         >     >     >             On May 15, 2015, at 5:39 PM, Juan PC <[email protected]> wrote:
>         >     >     >
>         >     >     >             Hi,
>         >     >     >
>         >     >     >             Well, your configuration can probably avoid the problem with the
>         >     >     >             benchmark, which I cannot run because the creation of the
>         >     >     >             OrangeFS file system fails.
>         >     >     >
>         >     >     >             The batch_create error is still there because it appears
>         >     >     >             just when I launch the servers. The creation of the root
>         >     >     >             directory fails too, as I have mentioned. I think this is the
>         >     >     >             relevant part of the log messages regarding the problem with
>         >     >     >             the root directory:
>         >     >     >
>         >     >     >             [D 05/15/2015 21:08:37] server_post_unexpected_recv
>         >     >     >             [D 05/15/2015 21:08:37] server_op_state_get_machine 999
>         >     >     >             [D 05/15/2015 21:08:37] Initialization completed successfully.
>         >     >     >             [D 05/15/2015 21:08:37] server_state_machine_alloc_noreq 27
>         >     >     >             [D 05/15/2015 21:08:37] server_op_state_get_machine 27
>         >     >     >             [D 05/15/2015 21:08:37] server_state_machine_start_noreq 0x1d6fa10
>         >     >     >             [D 05/15/2015 21:08:37] *** Trove KeyVal Read of /dda
>         >     >     >             [D 05/15/2015 21:08:37] op_queue add: 0x1d71100
>         >     >     >             [D 05/15/2015 21:08:37] [DBPF THREAD]: [KEYVAL -1]: -7
>         >     >     >             [D 05/15/2015 21:08:37] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE (KEYVAL_READ)
>         >     >     >             [D 05/15/2015 21:08:37] warning: keyval read error on handle 1048576 and key= /dda (BDB0073 DB_NOTFOUND: No matching key/data pair found)
>         >     >     >             [D 05/15/2015 21:08:37] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE (KEYVAL_READ) (ret: -1073742082)
>         >     >     >             [D 05/15/2015 21:08:37] op_queue add: 0x1d71100
>         >     >     >             [D 05/15/2015 21:08:37] server_state_machine_alloc_noreq 46
>         >     >     >             [D 05/15/2015 21:08:37] server_op_state_get_machine 46
>         >     >     >             [D 05/15/2015 21:08:37] server_state_machine_start_noreq 0x1d70f80
>         >     >     >             [D 05/15/2015 21:08:37] mgmt-create-root-dir: Init dist-dir-attr for dir meta handle 1048576 with tree_height=1, num_servers=2, bitmap_size=1, split_size=100, server_no=0 and branch_level=1
>         >     >     >             [D 05/15/2015 21:08:37] mgmt-create-root-dir: Init dist_dir_bitmap as:
>         >     >     >             [D 05/15/2015 21:08:37]  i=0 : 00 00 00 03
>         >     >     >             [D 05/15/2015 21:08:37]
>         >     >     >             [D 05/15/2015 21:08:37] creating 1 local dirdata files
>         >     >     >             [D 05/15/2015 21:08:37] creating 1 remote dirdata files
>         >     >     >             [D 05/15/2015 21:08:37] job_precreate_pool_get_handles: requesting 1 handles of type 16
>         >     >     >             [E 05/15/2015 21:08:37] Warning: unable to create root dir due to error: Invalid argument
>         >     >     >             [E 05/15/2015 21:08:37]          Your FS may be in an inconsistent state
>         >     >     >             [D 05/15/2015 21:08:37] server_state_machine_complete_noreq: 0x1d70f80
>         >     >     >             [D 05/15/2015 21:08:37] server_state_machine_terminate 0x1d70f80
>         >     >     >             [E 05/15/2015 21:08:43] PVFS2 server got signal 15 (server_status_flag: 4177919)
>         >     >     >             [D 05/15/2015 21:08:43] server_state_machine_terminate 0x1d2e970
>         >     >     >
>         >     >     >             Hope this helps.
>         >     >     >
>         >     >     >             Regards,
>         >     >     >
>         >     >     >                Juan
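
With debug logging enabled, the error entries are easy to lose among the `[D ...]` lines. The following is a minimal sketch for pulling out only the `[E ...]` entries, assuming each entry sits on one line in the `[LEVEL MM/DD/YYYY HH:MM:SS] message` layout seen in the excerpt above. The function names and the sample list are made up for illustration; this is not an OrangeFS tool.

```python
import re
from datetime import datetime

# One log entry per line: "[E 05/15/2015 21:08:37] message".
LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]) (?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\]\s?(?P<msg>.*)$"
)


def parse_line(line):
    """Return (level, timestamp, message), or None if the line does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    stamp = datetime.strptime(m.group("stamp"), "%m/%d/%Y %H:%M:%S")
    return m.group("level"), stamp, m.group("msg").strip()


def error_entries(lines):
    """Keep only the entries whose level is E."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None and entry[0] == "E"]


# A few lines copied from the log excerpt in this thread.
sample = [
    "[D 05/15/2015 21:08:37] job_precreate_pool_get_handles: requesting 1 handles of type 16",
    "[E 05/15/2015 21:08:37] Warning: unable to create root dir due to error: Invalid argument",
    "[E 05/15/2015 21:08:37]          Your FS may be in an inconsistent state",
    "[D 05/15/2015 21:08:37] server_state_machine_terminate 0x1d70f80",
]

errors = error_entries(sample)
for level, stamp, msg in errors:
    print(stamp.isoformat(), msg)
```

Lines that were wrapped by a mail client, as in the original posting, would have to be rejoined before they match this pattern.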
>         >     >     >
>         >     >     >
>         >     >     >                 On 15/05/15 at 22:13, Becky Ligon wrote:
>         >     >     >                 Juan:
>         >     >     >
>         >     >     >                 You may have hit upon another problem that we've
>         >     >     >                 encountered where the splitting of directories runs
>         >     >     >                 into a race condition.  Try this:
>         >     >     >
>         >     >     >                 1.  In your orangefs-server.conf file, set
>         >     >     >                 DistrDirServersInitial 1 and DistrDirServersMax 1
>         >     >     >                 in your multi-server configuration installation.
>         >     >     >
>         >     >     >                 2.  Delete your data and metadata areas and recreate
>         >     >     >                 them.  Start your servers.
>         >     >     >
>         >     >     >                 3.  Run your tests.
>         >     >     >
>         >     >     >                 See if this helps!
>         >     >     >
>         >     >     >                 NOTE:  We are working on a fix for this problem right
>         >     >     >                 now but don't have a working solution just yet.
>         >     >     >
>         >     >     >                 Becky
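
Step 1 of the workaround above can be sketched as a small text rewrite, assuming the two directives appear in orangefs-server.conf as plain `Name value` lines. The `set_directive` helper and the configuration fragment are hypothetical and only illustrate the edit; the data and metadata areas still have to be deleted and recreated afterwards, as step 2 says.

```python
import re


def set_directive(conf_text, name, value):
    """Set every `name <value>` line to the new value; return (text, replacements)."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(name)}[ \t]+)\S+(.*)$", re.MULTILINE
    )
    return pattern.subn(rf"\g<1>{value}\g<2>", conf_text)


# Hypothetical fragment; a real orangefs-server.conf contains many more options.
conf = """<FileSystem>
    Name orangefs
    DistrDirServersInitial 2
    DistrDirServersMax 2
</FileSystem>
"""

updated, n_initial = set_directive(conf, "DistrDirServersInitial", 1)
updated, n_max = set_directive(updated, "DistrDirServersMax", 1)
print(updated)
```

A replacement count of zero means the directive was not present in the file, in which case it would have to be added by hand in the appropriate section.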
>         >     >     >
>         >     >     >                 On Fri, May 15, 2015 at 3:38 PM, Juan PC <[email protected]> wrote:
>         >     >     >
>         >     >     >                    Hi Becky,
>         >     >     >
>         >     >     >                    Thank you for your response :-)
>         >     >     >
>         >     >     >                    The problem is that the log file grows at a rate of
>         >     >     >                    around 2 MiB per second (EventLogging is set to none!)
>         >     >     >                    and, more importantly, a simple pvfs2-ls does not work.
>         >     >     >                    The latter is probably due to an error message that I
>         >     >     >                    get after starting the server that stores the root
>         >     >     >                    file system:
>         >     >     >
>         >     >     >                    [E 05/15/2015 18:38:08] Warning: unable to create root dir due to error: Resource temporarily unavailable
>         >     >     >                    [E 05/15/2015 18:38:08]          Your FS may be in an inconsistent state
>         >     >     >
>         >     >     >                    although the batch_create errors appear later, when a
>         >     >     >                    second server is run.
>         >     >     >
>         >     >     >                    I have spent a lot of time trying different
>         >     >     >                    compilation options, configurations, db versions,
>         >     >     >                    checking that I run the right executables, that they
>         >     >     >                    use the same filesystem configuration file, etc., and
>         >     >     >                    the result is always the same. Well, to be honest, I
>         >     >     >                    was able to activate the file system once (I do not
>         >     >     >                    know how), but it started failing when I tried to
>         >     >     >                    create a few thousand files per directory (benchmark
>         >     >     >                    hpcs-io_1.2.0-rc1, scenarios 9-12).
>         >     >     >
>         >     >     >                    My feeling is that, with two servers, the problematic
>         >     >     >                    server (the one meant to store the root directory)
>         >     >     >                    does not communicate correctly with the second server.
>         >     >     >                    There is no firewall, SELinux is disabled, etc.
>         >     >     >
>         >     >     >                    Some final remarks:
>         >     >     >                    - Security is always the default one; I have not used
>         >     >     >                      either the --enable-security-key or the
>         >     >     >                      --enable-security-cert option.
>         >     >     >                    - Same steps with OrangeFS 2.8.7 and no problem at all.
>         >     >     >
>         >     >     >                    So I guess that I must be doing something terribly
>         >     >     >                    wrong, but I do not know what :-(
>         >     >     >
>         >     >     >                    If I can do something (for instance, running the
>         >     >     >                    servers with EventLogging set to verbose), just let
>         >     >     >                    me know.
>         >     >     >
>         >     >     >                    Regards,
>         >     >     >
>         >     >     >                            Juan
>         >     >     >
>         >     >     >                        On 15/05/15 at 20:12, Becky Ligon wrote:
>         >     >     >                     This is normal for 2.9.1, and it is okay to get the
>         >     >     >                     messages you are seeing.  batch_create comes into play
>         >     >     >                     when a server needs to gather more handles (like
>         >     >     >                     inodes) from another server.  The "Resource
>         >     >     >                     temporarily unavailable" error is generated when the
>         >     >     >                     capability associated with this request has timed
>         >     >     >                     out.  So, the calling server regenerates the
>         >     >     >                     capability and resends the batch_create request.
>         >     >     >
>         >     >     >                     The OFS development team is changing when these
>         >     >     >                     capabilities get generated for batch_create requests
>         >     >     >                     to alleviate this problem.  For now, you can ignore
>         >     >     >                     these messages.
>         >     >     >
>         >     >     >                     Sorry for the inconvenience.
>         >     >     >
>         >     >     >                     Becky
>         >     >     >
>         >     >     >
>         >     >     >
>         >     >     >                     On Fri, May 15, 2015 at 11:48 AM, Juan PC <[email protected]> wrote:
>         >     >     >
>         >     >     >                        Dear Becky,
>         >     >     >
>         >     >     >                        I am trying to use orangefs-2.9.1, but every time
>         >     >     >                        I run the servers I get the message of the subject
>         >     >     >                        in one of the servers, and its log file grows very
>         >     >     >                        quickly. The last reference that I have seen about
>         >     >     >                        this problem is
>         >     >     >                        http://www.beowulf-underground.org/pipermail/pvfs2-users/2015-April/004432.html.
>         >     >     >
>         >     >     >                        I have used the --disable-capcache option of
>         >     >     >                        configure, but with the same result. Do you know if
>         >     >     >                        this issue has already been fixed or if there is a
>         >     >     >                        workaround?
>         >     >     >
>         >     >     >                        Best regards,
>         >     >     >
>         >     >     >                                Juan
>         >     >     >
>         >     >     >
>         >     >     >
>         >     >     >
>         >     >     >
>         >     >     >
>         >     >     >
>         >     >     >
>         >     >     >
>         >     >
>         >     >
>         >     >
>         >     > --
>         >     > Sent from Gmail Mobile
>         >
>         >
>         >     --
>         >     D. Juan Piernas Cánovas
>         >     Departamento de Ingeniería y Tecnología de Computadores
>         >     Facultad de Informática. Universidad de Murcia
>         >     Campus de Espinardo - 30080 Murcia (SPAIN)
>         >     Tel.: +34868887657    Fax: +34868884151
>         >     email: [email protected]
>         >     PGP public key:
>         >     http://pgp.rediris.es:11371/pks/lookup?search=piernas%40ditec.um.es&op=index
>         >
>         >     *** Please send me your documents in plain text, HTML, PDF or
>         >     PostScript format :-) ***
>         >     _______________________________________________
>         >     Pvfs2-users mailing list
>         >     [email protected]
>         >     http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>         >
> 
> 
> 
> 


-- 
D. Juan Piernas Cánovas
Departamento de Ingeniería y Tecnología de Computadores
Facultad de Informática. Universidad de Murcia
Campus de Espinardo - 30080 Murcia (SPAIN)
Tel.: +34868887657    Fax: +34868884151
email: [email protected]
PGP public key:
http://pgp.rediris.es:11371/pks/lookup?search=piernas%40ditec.um.es&op=index

*** Please send me your documents in plain text, HTML, PDF or
PostScript format :-) ***
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
