erver/mgmt-get-user-cert-server.o] Error 1
make: *** Waiting for unfinished jobs....
Kyle Schochenmaier
___
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
d.org/pipermail/pvfs2-users/2009-April/002761.html
http://www.mail-archive.com/pvfs2-users@beowulf-underground.org/msg01735.html
~Kyle
Kyle Schochenmaier
On Tue, Jun 25, 2013 at 4:47 PM, Elaine Quarles wrote:
> Maybe someone on the list has experience running on Debian and can add to
> this
sion first.
If you're able to test out our fix to see if you still see the race
condition that would be great, as this may have been fixed already.
I'll check today and try to send you our patch.
Regards,
~Kyle
Kyle Schochenmaier
On Fri, Jan 25, 2013 at 1:27 AM, Zhang, Jingw
4/build
CFLAGS=-ggdb
(solution was to do a `make lib/libpvfs2-threaded.a lib/libofs-threaded.a install`)
Kyle Schochenmaier
Thanks!
Kyle Schochenmaier
On Wed, Mar 14, 2012 at 1:53 PM, Becky Ligon wrote:
> Kyle:
>
> We have made a fundamental change: the source will no longer store
> ./configure. You will have to generate it using prepare. We plan to
> provide ./configure when we create new releas
4096 2012-03-09 08:26 test
~Kyle
Kyle Schochenmaier
evice other than for me to suggest plugging in the first device,
until this gets rewritten :-(
Unfortunately I do not have the hardware to test any of the changes
I'll make, so if you're willing, once I get it rewritten you can be the
guinea pig for testing.
More info to come.
Kyle Schochen
"%s: max %d completion queue entries", __func__, hca_cap.max_cq);
cqe_num = IBV_NUM_CQ_ENTRIES;
od->nic_max_sge = hca_cap.max_sge;
Kyle Schochenmaier
On Tue, Jan 31, 2012 at 1:33 PM, Randall Martin wrote:
> Kyle,
>
> Yes I think we need some form of fail-over capabi
ing a push into HA with orangefs soon
so I am wondering what people's thoughts are here.
Is this something that would need to be implemented anyway, and does it
fit the HA scheme that is being examined for orangefs?
Thoughts?
Kyle Schochenmaier
On Tue, Jan 31, 2012 at 2:25 AM, vlad wrote:
> D
If you have a single-node setup, what happens if you mount pvfs2 to
/tmp/pvfs2-storage?
This should immediately eliminate a lot of potential issues with disks at least.
Kyle Schochenmaier
On Wed, Jul 1, 2009 at 5:57 PM, David Bonnie wrote:
> Sam -
>
> All of the nodes checked out
Any failed disks anywhere?
Kyle Schochenmaier
On Jul 1, 2009 5:08 PM, "Sam Lang" wrote:
On Jul 1, 2009, at 5:05 PM, David Bonnie wrote:
> Rob -
>
> Performance is down across all PVFS2 in...
How many files?
-sam
> > Prior to this problem we were getting ~22 MB/s w
I can confirm that I had the same 'bug' last week.
I had to fool around with hostnames (I think I changed localhost to
the output of 'hostname' in the conf file to get it to work, but I
don't remember if that was all).
~Kyle
Kyle Schochenmaier
On Tue, Feb 24, 2009 at 10:20
aration
/usr/include/unistd.h:757: warning: shadowed declaration is here
Kyle Schochenmaier
--- /home/kyle/pvfs2/src/common/quicklist/quicklist.h 2009-01-28 11:19:22.0 -0600
+++ src/common/quicklist/quicklist.h 2009-02-18 19:50:36.0 -0600
@@ -192,7 +192,7 @@ static __inline__ vo
/* confuses debugger */
Kyle Schochenmaier
On Tue, Feb 17, 2009 at 1:32 PM, Steven Truelove wrote:
> Thanks very much!
>
> Steven
>
> Kyle Schochenmaier wrote:
>
> I just verified it builds in 2.7.1 (and also that the pvfs2-ls error
> is still there), and i get the same
http://www.beowulf-underground.org/pipermail/pvfs2-developers/2009-January/004229.html
The code that would need to be worked on is in BMI which affects both
client/server
~Kyle
Kyle Schochenmaier
On Mon, Jan 12, 2009 at 11:06 AM, Kumar, Amit H. wrote:
> Kyle,
> Just to mention:
> I
needs, in a very abstract sense.
I can't reproduce the same behavior here, so I'm not sure where to go.
Kyle Schochenmaier
On Mon, Jan 12, 2009 at 10:49 AM, Kumar, Amit H. wrote:
> Yes, I built it with "disabling bmi-tcp".
> I believe, going through the list I found that
doesn't appear to be manifested here in IB.
Kyle Schochenmaier
On Mon, Jan 12, 2009 at 8:40 AM, Phil Carns wrote:
> Ah, Ok. I didn't realize that you were using infiniband. Can any IB gurus
> on the list confirm if it is responsible for the extra "sock" entries lso
I also noticed this this morning. pcarns mentioned to me that you can
back out the following patch to get it to build, if that is necessary.
http://www.pvfs.org/fisheye/rdiff/PVFS?csid=MAIN:slang:20081212171229&u&N
~Kyle
Kyle Schochenmaier
On Mon, Dec 15, 2008 at 1:31 PM, Bradley Se
On Tue, Oct 14, 2008 at 11:22 AM, <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I am trying to do some "load tests" with pvfs2, but find the following
> in the logs (I produced them with 'pvfs2-set-debugmask -m /mnt/test
> "network,server,client"'):
>
> Client:
>
> [D 11:34:10.421223] [INFO]: Mapping p
art with that if it's possible, just to make sure we're not
stepping on our own toes here.
Kyle
>
>>Is it possible that you somehow have a bmi_tcp client mounted somewhere?
>
> No I have double checked it.
>
>>What version of pvfs2 & kernel are you using, and do you have
>>infiniband setup properly?
>
> PVFS2: version 2.7.1
> Kernel: 2.6.18-53.1.14.el5
>
--
Kyle Schochenmaier
th-openib-includes=/usr/include --with-openib-libs=/usr/lib64 CFLAGS=-g
> --enable-nptl-workaround=yes --without-bmi-tcp
>
> Thanks!!!
> Amit
>
> -Original Message-
> From: Kyle Schochenmaier [mailto:[EMAIL PROTECTED]
> Sent: Thursday, August 28, 2008 1:55 PM
> To: Kum
it should work ??? Any thoughts please.
>
>
>
> Thank you,
>
> Amit
>
>
>
>>> Where else should I be looking for these default settings?
>>>
>>> Thanks,
>>>~ Esteban Molina-Estolano
>>>
>>
>>
>>
>> --
>> Kyle Schochenmaier
>>
>
>
--
Kyle Schochenmaier
se should I be looking for these default settings?
>
> Thanks,
>~ Esteban Molina-Estolano
>
--
Kyle
apped_offset(vo
num_groups,
server_ct);
}
-
+if(num_groups > server_ct)
+num_groups = server_ct;
total_stripes += global_stripes * factor;
/* if we are a server in the last group, make sure things
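The clamp added in the hunk above can be read in isolation as the following sketch. The function name `clamp_num_groups` and the `uint32_t` types are assumptions made for illustration; the patch applies the check inline, and the real variable types are not visible in the excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: never ask for more groups than there are servers.
 * Mirrors the two added lines of the patch:
 *     if (num_groups > server_ct) num_groups = server_ct;          */
uint32_t clamp_num_groups(uint32_t num_groups, uint32_t server_ct)
{
    if (num_groups > server_ct)
        num_groups = server_ct;
    return num_groups;
}

/* Small self-check showing the intended behaviour. */
void clamp_num_groups_selfcheck(void)
{
    assert(clamp_num_groups(2, 4) == 2);  /* already valid, unchanged  */
    assert(clamp_num_groups(8, 4) == 4);  /* too many groups, clamped  */
}
```

Clamping rather than failing keeps a file with an oversized `num_groups` usable on a small server set, which seems to be the intent of the hunk, though the surrounding code is not shown.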
way inside dist-twod-stripe.c
to accomplish this?
I'd be willing to look into it further if anyone has any fancy tricks
to accomplish this.
Kyle
On Tue, Jul 1, 2008 at 5:16 PM, Sam Lang <[EMAIL PROTECTED]> wrote:
>
> On Jul 1, 2008, at 4:15 PM, Kyle Schochenmaier wrote:
>
>>
iple servers
> setting the factor and num_groups to 1 does make the 2d distribution act
> like simple-stripe, but why not just use simple stripe in that case?
>
> -sam
>
>
> On Jul 1, 2008, at 3:19 PM, Kyle Schochenmaier wrote:
>
>> Patch is against cvs head, and modifies t
TRIPE_DEFAULT_GROUPS 1
19c19
< #define PVFS_DIST_TWOD_STRIPE_DEFAULT_FACTOR 256
---
> #define PVFS_DIST_TWOD_STRIPE_DEFAULT_FACTOR 1
-
Kyle Schochenmaier
Index: include/pvfs2-dist-twod-stripe.h
===
RCS file: /anoncvs/pvfs2/include/pvfs2-dist-twod-s
ious bug report?
> http://www.beowulf-underground.org/pipermail/pvfs2-developers/2008-June/004069.html
>
> thanks,
> -Phil
>
> Kyle Schochenmaier wrote:
>> The following patch clears up some issues with pointer aliasing on
>> 64bit machines in the twod-dist code
strip_factor = *(uint32_t*)value;
---
> dparam->group_strip_factor = *(int64_t*)value;
473c473
< encode_int32_t(pptr,&dparam->group_strip_factor);
---
> encode_uint32_t(pptr,&dparam->group_strip_factor);
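A minimal sketch of the width problem behind this kind of patch, assuming the caller hands the setter a pointer to a 64-bit value. Dereferencing that pointer as a 32-bit type reads only half of the stored object, and which half depends on byte order. The struct, the function name, and the 64-bit caller type are assumptions for illustration, not the PVFS2 code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the distribution parameters; only the
 * field named in the diff is modeled, and its width is assumed. */
struct twod_params {
    uint32_t group_strip_factor;
};

/* Assumes `value` points at an int64_t supplied by the caller.
 * Reads the full 64 bits, range-checks, then narrows explicitly.
 * Returns 0 on success, -1 if the value does not fit in 32 bits. */
int set_group_strip_factor(struct twod_params *dparam, const void *value)
{
    int64_t wide;

    assert(dparam != NULL && value != NULL);
    memcpy(&wide, value, sizeof(wide));   /* no aliasing cast needed */
    if (wide < 0 || wide > (int64_t)UINT32_MAX)
        return -1;
    dparam->group_strip_factor = (uint32_t)wide;
    return 0;
}
```

Reading through `memcpy` at the width the caller actually stored avoids both the aliasing cast and the byte-order dependence; the narrowing then happens in one visible place.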
--
Kyle Schochenmaier
_
rding the desire for something more than
> just round-robin striping. Is anyone currently working on this? Or can
> someone point me to the code that is currently responsible for the striping?
> I'd like to help develop this part if there isn't already an effort
> underway.
= twod_stripe
dist_params:
num_groups:2,strip_size:1048576,factor:256
(#3)
p5l8:/usr/src/gamess-hg# pvfs2-viewdist -f /pvfs/4node/2d/test
dist_name = twod_stripe
dist_params:
num_groups:2,strip_size:65536,factor:256
Notice the strip_size has been reverted back
seem to get data
> that was already in the file, instead of what I just wrote.
>
"When the file already exists" --- When the file already has the proper length.
--
Kyle Schochenmaier
Your version of the patch seems to do the trick, works fine on my testbed.
thanks,
+=Kyle
On Tue, Apr 8, 2008 at 3:42 PM, Kyle Schochenmaier <[EMAIL PROTECTED]> wrote:
> I'll test your version of the patch tomorrow afternoon, and give a headsup.
>
> +=Kyle
>
>
>
I'll test your version of the patch tomorrow afternoon, and give a headsup.
+=Kyle
On Tue, Apr 8, 2008 at 3:27 PM, Pete Wyckoff <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote on Tue, 08 Apr 2008 15:03 -0500:
>
> >
> > On Apr 8, 2008, at 2:21 PM, Kyle Schoc
he, &rq->buflist);
> > # if MEMCACHE_EARLY_REG
> >/* pin on post, dereg all these */
> > - if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
> > - rq->state.recv == RQ_RTS_WAITING_RTS_DONE)
> > + if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION)
> >memcache_deregister(ib_device->memcache, &rq->buflist);
> >if (rq->state.recv == RQ_WAITING_INCOMING
> > && rq->buflist.tot_len > ib_device->eager_buf_payload)
> >
> >
>
>
--
Kyle Schochenmaier
(rq->state.recv == RQ_RTS_WAITING_RTS_DONE)
> memcache_deregister(ib_device->memcache, &rq->buflist);
> # endif /* !MEMCACHE_EARLY_REG */
Thanks,
+=Kyle
--
Kyle Schochenmaier
s a SilverStorm/Qlogic 9120 4x DDR.
>
> Did I leave something out?
>
> Thanks again,
>
> Eric
>
>
>
>
>
>
>
>
>
> On Sun, Mar 30, 2008 at 06:35:30PM -0500, Kyle Schochenmaier wrote:
> > We are currently trying to track down this bug, as
ome clues for how to debug this further or track
> down what the problem is?
>
> Any suggestions are welcome.
>
> Thanks,
>
> Eric J. Walter
> Department of Physics
> College of William and Mary
>
>
> ___
> Pvfs2-users mailing list
> [EMAIL P
linked against an old copy of ibverbs, reconfigured and it
compiled just fine.
(also messed up the declaration in ib.h)
Got it to compile and I'll start testing it to see how close we are
getting to nailing this issue.
~~Kyle
On Mon, Mar 10, 2008 at 1:51 PM, Kyle Schochenmaier
cq function needs
> to be done to make some progress, or if there's some other way to back
> off out of openib_post_wr_rdma when the queues are full.
>
>
>
> Kyle Schochenmaier wrote:
> > Pete -
> >
> > We're still trying to track down this "bu
*if (wq_overflow(&qp->sq, nreq,
to_mcq(qp->ibv_qp.send_cq))) {
ret = -1;
*bad_wr = wr;
goto out;
}
--
Kyle Schochenmaier
re, do you know if ib spec
states this as a fatal error?
If its not a fatal error - though it looks fatal to me - can we
attempt to repost the send with a backed-off/smaller sge list?
Thanks,
Kyle
--
Kyle Schochenmaier
dentified some
hardware failures and removed the hardware from the test scenario, and it
looks like things are good now. I haven't had a pvfs2 failure yet!
Thanks again Pete!
~Kyle
--
Kyle Schochenmaier
[EMAIL PROTECTED]
Research Assistant, Dr. Brett Bode
AmesLab - US Dept.Energy
Scal
n the hardware is still
> working? And you recompiled okay? Maybe yank out the printfs in
> case there is some memory corruption going on somewhere.
>
> -- Pete
>
--
Kyle Schochenmaier
> %s",
> + __func__, bh->c->peername, rq,
> + rq_state_name(rq->state.recv));
> }
>
> qlist_add_tail(&bh->list, &bh->c->eager_send_buf_free);
> @@ -599,7 +601,8 @@ encourage_recv_incoming(struc
nd that to the
client somehow?
Can you give us a brief description of the process Pete?
Thanks,
Kyle
--
Kyle Schochenmaier
[EMAIL PROTECTED]
Research Assistant, Dr. Brett Bode
AmesLab - US Dept.Energy
Scalable Computing Laboratory
--
Kyle Schochenmaier
WARN|BMX_DB_ALL)
There will be a lot of output but it may point out the issue.
Scott
On Aug 2, 2007, at 2:56 PM, Kyle Schochenmaier wrote:
Sam and I looked into a problem we found with the noncontig-test that
I'm using as one of my benchmarks in my suite.
Test setup:
pvfs2-fs: MX on 4 data
Sam and I looked into a problem we found with the noncontig-test that
I'm using as one of my benchmarks in my suite.
Test setup:
pvfs2-fs: MX on 4 data servers, 5th server is the client. (CVS Head)
If I run the test using MX, it will fail, but with TCP, the test
completes, we had originally th
Sam Lang wrote:
On Jun 20, 2007, at 4:52 PM, Murali Vilayannur wrote:
Sam,
That's true. The multiple servers per node stuff was added to
genconfig just for debugging. I can't think of any good uses for
that in practice. Are there? If not, we might be able to get away
with just using a -p
this might bring about.
Just an idea I thought I'd throw out there.
Kyle
--
Kyle Schochenmaier
0x0044cdae in PINT_state_machine_continue ()
#5 0x00410feb in main ()
And I believe the state/op from my debug file is '28', if that's at all
helpful.
Kyle
Sam Lang wrote:
On Apr 24, 2007, at 1:41 PM, Kyle Schochenmaier wrote:
Is there a reason why this is a 'fatal'
Is there a reason why this is a 'fatal' error?
I forgot to add the new install to my path, and was therefore using an
old version of pvfs2-ls, which also caused my MD server to segfault.
I think we could do a workaround on this to make it still error, but not
segfault.
The text is also goof
I got ahold of some hardware information, realized that I was running
into resource issues with our NIC, and modified the FlowBufferSizeBytes
parameter in the server configs. So far I have not been able to
reproduce this problem.
On a related note, I think we should add in the cache-flushing c
I can reproducibly trigger this error on the server by doing multiple
instances of pvfs2-cp over various IB hardware.
For this one, I did:
pvfs2-cp -t /pvfs2/1node/test2 /dev/null & pvfs2-cp -t
/pvfs2/1node/test2 /dev/null & pvfs2-cp -t /pvfs2/1node/test2 /dev/null
& pvfs2-cp -t /pvfs2/1nod
Patch to fix a compile problem with openib: a goto statement pointed to
a label which didn't exist.
--- ../pvfs2-orig/src/io/bmi/bmi_ib/ib.c 2007-02-01 17:07:48.0 -0600
+++ src/io/bmi/bmi_ib/ib.c 2007-02-13 11:29:01.0 -0600
@@ -1817,6 +1817,7 @@ static int ib_tcp_ser
1, argv=0xfcc6c38)
at src/apps/admin/pvfs2-ls.c:779
--
Kyle Schochenmaier
version is that now we only
need one QP, not two. Not sure how this could cause the sort of
regression you are seeing (yet).
-- Pete
--
Kyle Schochenmaier
--
Kyle Schochenmaier
ax_wr value from 2.5.1's
ibv_device_query() function.
However, 2.6.1 is reporting 0?
This looks to me now like we're tripping up the driver somehow, if the 0
is accurate.
Was there any change in order of init between the two releases?
I'm puzzled.
Kyle Schochenmaier wrote:
r you could shrink the request num_wr and
see if that helps. Either way, next stop is your IB card vendor.
-- Pete
--
Kyle Schochenmaier
On Dec 25, 2006, at 9:59 AM, Pete Wyckoff wrote:
[EMAIL PROTECTED] wrote on Fri, 22 Dec 2006 15:45 -0600:
We stumbled across some hardware issues/limitations with our IB
hardware, and have 'devised' a way of working around these
limitations by increasing the sizes of memory region allocations
Pete -
We stumbled across some hardware issues/limitations with our IB
hardware, and have 'devised' a way of working around these
limitations by increasing the sizes of memory region allocations
inside BMI, currently we're seeing chunks or blocks of 64KB being
allocated on the hardware le
On Dec 17, 2006, at 2:36 PM, Scott Atchley wrote:
Hi all,
I am ready to start testing MX support.
What do I need to do to include bmi_mx in the configure and make
process? Would it be easier to work with a cvs checkout or a
release tarball?
As for testing, I do not have disks fast enoug
On Nov 30, 2006, at 11:40 AM, Sam Lang wrote:
Hi Kyle,
I don't have a fix for your problem yet, but I think the message
about "Please make sure that the pvfs2-client is running" is
erroneous. The real error is the pvfs_bufmap_copy_iovec_from_user
error.
I'll look at this tomorrow and
em is still up?
I'm able to do regular operations just fine directly to the filesystem
via libpvfs2, but nothing works on the vfs mount. This error message
isn't very helpful to me as the client is still running. What should I
look for to debug this?
+=Kyle
--
Kyle Schochenmaier
--
Kyle Schochenmaier
--
Kyle Schochenmaier
/* pvfs2-config.h.
@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
--
Kyle Schochenmaier
returned status 10.
ib_mthca :01:00.0: modify QP 3->4 returned status 10.
ib_mthca :01:00.0: modify QP 3->4 returned status 10.
ib_mthca :01:00.0: modify QP 3->4 returned status 10.
ib_mthca :01:00.0: modify QP 3->4 returned status 10.
Any ideas?
thanks
--Kyle
--
Kyle
breaking
underneath since those messages aren't helping out right now.
-- Pete
--
Kyle Schochenmaier
lish to me :(
-- Kyle
--
Kyle Schochenmaier
1 exit status
configure:7095: $? = 1
configure: program exited with status 1
configure: failed program was:
-- Kyle
--
Kyle Schochenmaier
2:45.319433] BMI_ib_post_send_list: listlen 1 tag 32780.
[E 10:33:15.509651] job_time_mgr_expire: job time out: cancelling bmi
operation,
job_id: 235669.
[D 10:33:15.522650] BMI_cancel: cancel id 235670
[D 10:33:15.522731] test_sq: sq 0x644670 cancelled.
[D 10:33:15.522779] BMI_testcontext compl
someone may be
trying to close it twice -- but I can tell you that I always see two of
these messages??
I'm also still seeing weird error codes, and am not sure if we've
addressed this yet, but I doubt that's our real problem.
hope this helps,
-- Kyle
--
Kyle Schochenmaier
[E
segfault, but I'm
thinking it may be a good idea for now until we can work out why it's
closing the connections, to put a check in there to make sure oc is
still valid.
Has anyone run into this or other issues with servers going down in openib?
-- Kyle
--
Kyle Schochenmaie
e code for right now, and there are some obvious issues
with it, but it causes the hangs quite reproducibly.
There may be a hardcode that I thought I had patched, and not ever
exactly patched in the Makefile for the pvfs2 headers.
Give it a try, it should break something, that's for sure.
-- Ky
ing
other than the startup line.
-
Pete, which level of debugging would be best to get a good log? trove
or network?
Thanks,
-- Kyle
--
Kyle Schochenmaier
o fix this warning, but didn't go through and
check the vapi-IB equivalent. We may want to look into that at some
point, though I'm not sure if it's necessary.
~ Kyle
--
Kyle Schochenmaier
s.
Which flags would you recommend for debug output? Right now I'm
doing client and network stuff, which makes for a lot of debug to begin
with.
Thanks,
~Kyle
--
Kyle Schochenmaier
aad216b0 canceling a total of 1 BMI
or Trove operations
--
Kyle Schochenmaier
ire the kernel interface, so I
think this may be a good way to get moderately reproducible non-kernel
interface effects.
I will work on getting a server-side trace out today, I think I have a
log (server+client) from yesterday where it stalled on a tab-completion
attempt, though not from this
stcontext completing: 205
[D 15:45:16.050797] trying to add object reference to acache
[D 15:45:16.050816] (0x1013c950) getattr state: getattr_cleanup
--
Kyle Schochenmaier
not entirely sure why this could/would be the case where a larger
offset of generated completions for id's would break the server, but I
have a distinct feeling that this may be the cause of it.
Any ideas?
thanks,
- Kyle
--
Kyle Schochenmaier
the pvfs2-client interface?
thanks,
- Kyle
--
Kyle Schochenmaier
_connection() [setup.c:119] ?
And would there be any way to figure out if we're using all of the
'resources' we've specified using those settings?
(I've had to change several of them to get them to work with my HCA's)
thanks,
~Kyle
--
Kyle Schochenmaier
he SC'05 PVFS2 BOF documents, but am
unsure exactly how everything was setup, and/or how all the clients were
synchronized to get an accurate test. Any advice would be helpful as
well :)
thanks,
- Kyle
--
Kyle Schochenmaier
utdown gracefully (cleanup and then exit 0) with
a SIGHUP (kill -1 ). There's signal handling code in the server
that is supposed to catch SIGHUP at least, and begin shutdown. Can
you try kill -1 and see if that works for you?
-sam
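The SIGHUP behaviour described above (catch the signal, clean up, exit 0) generally follows the pattern sketched here. This is not the pvfs2-server code: the function names and the `do_work` callback are hypothetical, and only the POSIX calls (`sigaction`, `sigemptyset`) are real.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Written by the handler, polled by the main loop. sig_atomic_t is the
 * type that is safe to assign from inside a signal handler. */
static volatile sig_atomic_t shutdown_flag = 0;

static void on_sighup(int signo)
{
    (void)signo;
    shutdown_flag = 1;          /* only set a flag; do no real work here */
}

/* Install the SIGHUP handler. Returns 0 on success, -1 on failure. */
int install_sighup_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGHUP, &sa, NULL);
}

int shutdown_requested(void)
{
    return shutdown_flag != 0;
}

/* Hypothetical server loop: run do_work() until SIGHUP arrives, then
 * return 0 so the caller can return from main() normally. */
int run_until_sighup(void (*do_work)(void))
{
    assert(do_work != NULL);
    if (install_sighup_handler() != 0)
        return 1;
    while (!shutdown_requested())
        do_work();
    /* cleanup (close files, free state) would go here */
    return 0;
}
```

A process built this way can be stopped with `kill -1 <pid>`. Because it leaves through a normal return rather than an unhandled signal, exit-time hooks still get a chance to run.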
On Feb 20, 2006, at 5:49 PM, Kyle Schochenmaier wrote
rmally: by returning from main or by calling exit. Calling the
low-level function _exit does not write the profile data, and neither
does abnormal termination due to an unhandled signal."
I'd like to do the profile on the server-side i/o, not the client-side
test programs...
Any ideas